solution-erp/docs/changelog/sessions/2026-05-26-1630-s31-rag-baseline-pass.md
pqhuy1987 1e1c9a2433
[CLAUDE] Docs: S31 RAG v1.3 baseline PASS (11/11 recall@5=1.000) + gotcha #52
- eval/runs/: baseline v1.1 final PASS after retrieval.py fix (vector search restored)
- eval/trial-state-lock.json: quality_gate.pass=true, baseline=1.000, avg_rerank=0.847
- docs/gotchas.md: +gotcha #52 qdrant-client 1.18 removed search() silent AttributeError
- docs/STATUS.md: S31 entry — RAG PASS, retrieval.py fix, CLI restart required
- docs/HANDOFF.md: S31 brief + CRITICAL CLI restart note
- docs/changelog/sessions/: S31 session log

Root cause: qdrant-client 1.18 removed search() → vec_results always [] → BM25-only
Fix: retrieval.py query_points().points (applied to AI_INFRA repo)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 13:42:04 +07:00


Session 31 — RAG v1.3 Baseline PASS + retrieval.py critical fix

Date: 2026-05-26
Duration: ~1.5h
Commits: 1 pending (S31 docs/infra)
Sub-agents: 0

Summary

Session 31 followed directly from S30 (RAG v1.3 setup). S30 had bootstrapped 2949 chunks, but the v1.1 baseline was only tentative (PENDING_RELOAD). S31 re-ran the eval after a CLI restart and found the real root cause.

Symptom: Most golden-set queries returned 0 results even though the Qdrant collection reported status green with 2949 points.

Root cause: qdrant-client 1.18.0 removed the QdrantClient.search() method entirely. retrieval.py:vector_search() still called _qdrant.search(...), which raised AttributeError. The exception was silently swallowed by except Exception: continue → vec_results was always [] → the pipeline fell back to BM25-only.
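The failure mode described above can be reproduced in miniature. The sketch below is not the actual retrieval.py. `ClientWithoutSearch`, `vector_search_silent`, and `vector_search_loud` are hypothetical names invented for illustration. It shows how a blanket except Exception: continue turns a missing method into an empty result list, and how handling AttributeError separately would have surfaced the break.

```python
import logging

logger = logging.getLogger("retrieval_sketch")


class ClientWithoutSearch:
    """Hypothetical stand-in for a client whose search() method was removed."""

    def query_points(self, collection_name, query, limit):
        return []


def vector_search_silent(client, collections, query_vector, top_k=5):
    """Mirrors the failure mode: every error is swallowed, so the result is []."""
    results = []
    for c in collections:
        try:
            hits = client.search(collection_name=c, query_vector=query_vector, limit=top_k)
        except Exception:
            continue  # the AttributeError disappears here
        results.extend(hits)
    return results


def vector_search_loud(client, collections, query_vector, top_k=5):
    """Same loop, but a missing method is treated as a programming error."""
    results = []
    for c in collections:
        try:
            hits = client.search(collection_name=c, query_vector=query_vector, limit=top_k)
        except AttributeError:
            raise  # API mismatch: fail instead of degrading to BM25-only
        except Exception:
            logger.exception("vector search failed for collection %s", c)
            continue
        results.extend(hits)
    return results


client = ClientWithoutSearch()
silent_results = vector_search_silent(client, ["proj_solution_erp"], [0.1, 0.2])

try:
    vector_search_loud(client, ["proj_solution_erp"], [0.1, 0.2])
    loud_raised = False
except AttributeError:
    loud_raised = True
```

The silent variant returns an empty list with no trace, which matches the symptom in this session. The loud variant raises immediately.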

Diagnosis chain:

  1. q01/q02/q03/q08 worked (BM25 exact match found all tokens in same chunk)
  2. q04/q05/q06/q07/q09/q10/q11 returned 0 (BM25 strict AND failed, vector broken)
  3. Qdrant HTTP /collections/proj_solution_erp → status=green, points_count=2949
  4. BM25 SQLite FTS: MediatR → 3 hits, fallback → 173 hits, ApprovalWorkflow → 40 hits (data IS there)
  5. python -c "from qdrant_client import QdrantClient; c=QdrantClient(...); c.search" → AttributeError confirmed
  6. Tested query_points(query=...).points → worked, 5 results returned
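Steps 1 and 2 hinge on how a strict AND keyword match behaves when vector search contributes nothing. The toy below is an assumption about that behavior, not the project's BM25/FTS code. The chunk texts and the `strict_and_match` helper are invented for illustration.

```python
def strict_and_match(query, chunks):
    """Return ids of chunks that contain every query token (case-insensitive)."""
    tokens = query.lower().split()
    return [
        chunk_id
        for chunk_id, text in chunks.items()
        if all(tok in text.lower().split() for tok in tokens)
    ]


# Invented chunks: each token exists somewhere, but not always in the same chunk.
chunks = {
    "c1": "MediatR pipeline behavior registration",
    "c2": "ApprovalWorkflow fallback approver rule",
    "c3": "fallback handler for notifications",
}

# Both tokens live in c2, so strict AND finds it.
same_chunk = strict_and_match("ApprovalWorkflow fallback", chunks)

# Both tokens exist in the corpus, but never together, so strict AND finds nothing.
split_across = strict_and_match("MediatR fallback", chunks)
```

With vector search returning nothing, a query like the second one has no other path to a result. That is consistent with the per-term hit counts in step 4 being non-zero while whole queries returned 0.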

Fix Applied

File: D:\Dropbox\CONG_VIEC\AI_INFRA\claude-rag\lib\retrieval.py → vector_search() function

# Before (broken in qdrant-client 1.18):
hits = _qdrant.search(
    collection_name=c,
    query_vector=query_vector,
    limit=top_k,
    with_payload=True,
)
for h in hits:
    results.append({"chunk_id": h.id, ...})

# After (qdrant-client 1.12+ query_points API):
resp = _qdrant.query_points(
    collection_name=c,
    query=query_vector,    # param renamed
    limit=top_k,
    with_payload=True,
)
for h in resp.points:    # access .points on response
    results.append({"chunk_id": h.id, ...})
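If the same code had to run against both older and newer client versions, one possible shape is a small wrapper that picks whichever method the installed client exposes. This is a sketch, not the change applied in this session. `vector_hits`, `NewStyleClient`, and `OldStyleClient` are hypothetical names, and the stub classes only imitate the call signatures shown in the before/after snippet.

```python
from types import SimpleNamespace


def vector_hits(client, collection_name, query_vector, top_k=5):
    """Return scored points using whichever API the client object exposes."""
    if hasattr(client, "query_points"):
        resp = client.query_points(
            collection_name=collection_name,
            query=query_vector,
            limit=top_k,
            with_payload=True,
        )
        return list(resp.points)  # newer API wraps hits in a response object
    if hasattr(client, "search"):
        return list(
            client.search(
                collection_name=collection_name,
                query_vector=query_vector,
                limit=top_k,
                with_payload=True,
            )
        )
    raise RuntimeError("client exposes neither query_points() nor search()")


class NewStyleClient:
    """Stub imitating a client that only has query_points()."""

    def query_points(self, collection_name, query, limit, with_payload):
        return SimpleNamespace(points=[SimpleNamespace(id="a", score=0.9)][:limit])


class OldStyleClient:
    """Stub imitating a client that only has search()."""

    def search(self, collection_name, query_vector, limit, with_payload):
        return [SimpleNamespace(id="b", score=0.8)][:limit]


new_ids = [h.id for h in vector_hits(NewStyleClient(), "proj_solution_erp", [0.1, 0.2])]
old_ids = [h.id for h in vector_hits(OldStyleClient(), "proj_solution_erp", [0.1, 0.2])]
```

The explicit RuntimeError at the end is the important part: an unknown client shape fails visibly instead of yielding an empty list.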

Eval Results

| Version | recall@5 | hits/11 | Pass |
|---|---|---|---|
| v1.0 (S30) | 0.455 | 5/11 | FAIL |
| v1.1 tentative (S30 stale MCP) | 0.364 | 4/11 | FAIL |
| v1.1 FINAL (S31 after fix) | 1.000 | 11/11 | PASS |

avg_top1_rerank: 0.847 (gate threshold 0.65)
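The recall@5 figures are consistent with a per-query hit rate, that is, hits divided by the 11 golden queries. The sketch below assumes that definition. `is_hit` and `recall_at_k` are illustrative helpers, not the eval script, and the rule that a query counts as a hit when any expected chunk appears in the top 5 is an assumption.

```python
def is_hit(retrieved_ids, expected_ids, k=5):
    """A query counts as a hit if any expected chunk id is in the top k."""
    return any(cid in expected_ids for cid in retrieved_ids[:k])


def recall_at_k(per_query_hits):
    """Fraction of queries that were hits, rounded as in the results table."""
    return round(sum(per_query_hits) / len(per_query_hits), 3)


# Hit counts taken from the table above (11 golden queries per run).
runs = {
    "v1.0 (S30)": 5,
    "v1.1 tentative (S30 stale MCP)": 4,
    "v1.1 FINAL (S31 after fix)": 11,
}
recalls = {
    name: recall_at_k([True] * hits + [False] * (11 - hits))
    for name, hits in runs.items()
}

# Gate on the average top-1 rerank score, using the threshold quoted in the log.
rerank_gate_ok = 0.847 >= 0.65

# Illustrative single-query check with made-up ids.
example_hit = is_hit(["x9", "x3", "x7"], {"x3"})
```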

Negative query behavior (q12/q13/q14):

  • q12 GraphQL: rerank 0.43 (< 0.7 threshold — not a false positive)
  • q13 Redis: rerank 0.38 — correct exclusion
  • q14 Kubernetes: rerank 0.43 — correct exclusion
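The negative-query check amounts to comparing each top rerank score against the 0.7 threshold quoted for q12. A minimal sketch, assuming a score at or above the threshold would count as a false positive (whether the real check uses >= or > is not stated in this log); `is_false_positive` is a hypothetical helper:

```python
NEGATIVE_THRESHOLD = 0.7  # threshold quoted in the q12 note

# Top rerank scores for the three negative queries, from the list above.
negative_scores = {
    "q12 GraphQL": 0.43,
    "q13 Redis": 0.38,
    "q14 Kubernetes": 0.43,
}


def is_false_positive(top_rerank_score, threshold=NEGATIVE_THRESHOLD):
    """Assumed rule: an off-topic query scoring at/above threshold is a false positive."""
    return top_rerank_score >= threshold


false_positives = [q for q, s in negative_scores.items() if is_false_positive(s)]
```

All three scores fall below the threshold, so the list of false positives is empty.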

Files Updated

AI_INFRA:

  • claude-rag/lib/retrieval.py — vector_search() fix
  • claude-rag/eval_v11.py — temp eval script (not in SOLUTION_ERP)

SOLUTION_ERP:

  • eval/runs/2026-05-26-baseline-v1.1-final.json — official baseline record
  • eval/trial-state-lock.json — quality_gate.pass=true, baseline_recall=1.000
  • docs/gotchas.md — gotcha #52 added
  • docs/STATUS.md — S31 entry added
  • docs/HANDOFF.md — S31 entry added

New Gotcha #52

qdrant-client 1.18 removed the search() API. When a library upgrade breaks an API that the retrieval pipeline calls, a silent except Exception: continue raises no error and simply returns empty results. Fix: use query_points().points. Lesson: pin the qdrant-client version OR add a startup health check such as hasattr(_qdrant, 'query_points').
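The health check suggested in the lesson could run once at startup, before any query is served. The sketch below is one possible form. `assert_client_api`, `REQUIRED_CLIENT_METHODS`, and the two stub classes are hypothetical and stand in for the real client object.

```python
REQUIRED_CLIENT_METHODS = ("query_points",)


def assert_client_api(client, required=REQUIRED_CLIENT_METHODS):
    """Fail fast if the client object lacks a method the pipeline depends on."""
    missing = [name for name in required if not callable(getattr(client, name, None))]
    if missing:
        raise RuntimeError(
            f"{type(client).__name__} is missing required methods: {', '.join(missing)}"
        )


class GoodClient:
    """Stub with the method the pipeline needs."""

    def query_points(self, **kwargs):
        return None


class OutdatedClient:
    """Stub exposing only the removed-style method name."""

    def search(self, **kwargs):
        return []


assert_client_api(GoodClient())  # passes silently

try:
    assert_client_api(OutdatedClient())
    check_failed = False
except RuntimeError as exc:
    check_failed = True
    failure_message = str(exc)
```

Raising at startup turns a silent recall drop into an immediate, named failure, which is the property the blanket except removed.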

Next Session

CRITICAL: Restart the CLI to pick up the retrieval.py fix (the MCP server still has the old code loaded in memory).

After CLI restart:

  • Live search_memory will use fixed vector + BM25 pipeline
  • RAG recall@5 = 1.000 confirmed

Other pending:

  • Curate 4 sub-agent MEMORY files (Implementer over threshold)
  • Plan B-Wrap (Contract V2 test bundle BW1-BW7)
  • Phase 9 UAT hard blockers (SMTP, rotate creds, SQL backup, win-acme cert)