solution-erp/.claude/agent-memory/cicd-monitor/MEMORY.md
pqhuy1987 447082fb03 [CLAUDE] Docs: S80 curate L1 over-cap reviewer/inv-codebase/cicd -> L2 (archive-gate keep-floor manual, A7 217/217)
- 3 over-cap sub L1 -> L2 archive byte-exact: reviewer 45->10KB, investigator-codebase 40->10KB, cicd-monitor 39->12KB
- 31 entries moved (sed, +N -0 additive, 0 byte-loss) + 31 _INDEX substring pointers; A7 GATE PASS 217/217 resolve
- stale foundation counts flushed: 130/263->354 test, 55->71 gotcha, Mig 40/55->57, 84->88 table, bundle->#330
- 0 production code, state unchanged (Mig 57 / 88 tables / 354 test / gotcha 71)
- WATCH (A6 strike-1, no-action): frontend-designer 26KB + test-specialist 28KB
- lesson: _INDEX substring MUST quote-free (A7 quote-parser caught escaped-quote PURO pointer that self-grep missed)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 11:29:11 +07:00


CI/CD Monitor Agent — Persistent Memory

Persistent cross-session diary. First ~200 lines auto-injected at spawn (L1 HOT). Update BEFORE every stop. Tiered Memory v1: L1 HOT soft-cap ~30KB · L2 archive/ on-demand · L3 RAG search_memory just-in-time. Keep entry ≤ 1.5K chars (gotcha #53). Full verbatim run history pre-S40 → git d2f52ba + archive/2026-05-{runs,q2,q3,q4}.md + archive/2026-06.md. 🗺️ Lookup map: archive/_INDEX.md, 1 line/record + substring pointer (sha-keyed, Ctrl-F fallback); read verbatim + 2026-0{5,6}.gist.md on demand. S80 curate: moved 3 huge run-entries (#330/#318/#325) → archive.


🎯 Role baseline

Read-only CI/CD + post-deploy verifier for SOLUTION_ERP. Polls Gitea Actions API, verifies test gate + deploy ship + prod health. Tools: Read, Grep, Glob, Bash, WebFetch + 5 RAG MCP. Output: PASS/FAIL + evidence <500 words. Skills: iis-deploy-runbook + dependency-audit-erp + ef-core-migration. Spawn ~150K: trade-off for catching failures automatically.


🚨 Recurring CI/CD bug patterns (catch priority)

  • #39 act_runner github.com TCP timeout — run hangs at "Set up job" 21s. Log: dial tcp github.com:443 i/o timeout. Fix: manual-checkout bypass hardcoded in .gitea/workflows/deploy.yml. DO NOT revert.
  • #40 npm cache tsc not found: build_fe_admin fails after enabling cache: npm. DISABLED, rolled back in a21790d. DO NOT re-enable.
  • #41 paths-ignore docs-only skip — code commit does not trigger CI? Check git diff --name-only HEAD~1 HEAD vs paths-ignore: ['docs/**','**/*.md','.claude/skills/**']. Gitea evaluates the push range: ≥1 commit with a non-ignored file → whole range builds.
  • #25 IIS WebSocket: notification-hub/negotiate returns 401/404 on prod. Fix: enable WebSocket module in web.config of site api (skill iis-deploy-runbook).
  • #48 SQLite tie-break: OrderByDescending(CreatedAt).First() picks the wrong row when 2+ .Add() calls share the same frozen-clock timestamp. Fix: discriminator filter BEFORE OrderBy.
  • Bundle hash unchanged = ship FAIL — push+action success but prod unchanged. Verify via INDEX.HTML ref (curl -s https://admin.solutions.com.vn/ | grep -oE '/assets/index-[A-Za-z0-9_-]+\.(js|css)'), NOT by GETting a hash-named asset directly. ⚠️ Verify MUST run after status=success (Run #242 false-positive: checked while "running" → stale hash).
    • 🔴 #69 (S72 Run #312) — FE bundle hash NON-DETERMINISTIC per rebuild. deploy.yml Remove-Item fe-*\* -Exclude web.config + Copy-Item dist\* runs UNCONDITIONALLY every run (path-filter gates whole-workflow trigger, NOT per-step). Identical FE source ⟹ DIFFERENT hash each deploy (Vite/rolldown non-reproducible chunk-id). PROVEN: governance-only 18fced6 (0 files in fe-*/src) rotated BOTH bundles. ⟹ "BE-only/governance ⟹ bundle frozen" is FALSE; rotation EXPECTED every deploy. To detect REAL FE change, diff fe-*/src in commit/range, NOT hash delta.
    • ⚠️ SPA-fallback 200 trap (S72): server rewrites /* → /index.html, so GET /assets/index-<ANYTHING>.js returns 200 even for a fake hash (control ZZdoesnotexist0.js→200). Old-hash-still-200 is MEANINGLESS. RELIABLE signal = parse index.html <script src>, GET that exact path + check size_download LARGE (real ~1.6MB vs fake ~919b) + Last-Modified in deploy window + 2nd-fetch byte-stable (no mid-deploy transient).
  • Migration drift prod vs repo — compare ls .../Persistence/Migrations/*.cs vs sqlcmd __EFMigrationsHistory. Fix: check Program.cs/DbInitializer app.MigrateDatabase() + app pool recycle. Mig-applied proof = __EFMigrationsHistory top==repo-HEAD; stuck-old ⟹ pool didn't recycle ⟹ FAIL even if status=success+bundle-rotated.
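
The bundle checks above (parse index.html, then reject the SPA-fallback response by size) can be sketched as two pure functions. This is a minimal illustration, not the agent's actual tooling: the regex is the one from the grep in the notes, while `MIN_REAL_BUNDLE_BYTES` and both function names are hypothetical, and equal size across two fetches is only a proxy for byte-stability.

```python
import re

# Same pattern as the grep in the notes; matches hashed entry assets.
ASSET_REF = re.compile(r"/assets/index-[A-Za-z0-9_-]+\.(?:js|css)")

# Hypothetical threshold. The notes report ~1.6MB for a real bundle and
# ~919 bytes for the SPA-fallback index.html served on a fake hash.
MIN_REAL_BUNDLE_BYTES = 100_000


def extract_asset_refs(index_html: str) -> list[str]:
    """Return hashed asset paths referenced by index.html, de-duplicated, in order."""
    refs: list[str] = []
    for ref in ASSET_REF.findall(index_html):
        if ref not in refs:
            refs.append(ref)
    return refs


def looks_like_real_bundle(size_bytes: int, second_size_bytes: int) -> bool:
    """True when the asset is large and its size is unchanged on a second fetch."""
    return size_bytes >= MIN_REAL_BUNDLE_BYTES and size_bytes == second_size_bytes
```

The sizes would come from two GETs of the exact path returned by `extract_asset_refs`; a 200 status alone says nothing because of the fallback rewrite.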

📋 5-stage checklist (EVERY run)

  • Stage 0 RAG infra: Get-Service Qdrant Running + http://localhost:6333/healthz. Collection proj_solution_erp.
  • Stage 1 Push+filter: git log -1 --format='%H %s' + git log origin/main..HEAD empty + diff vs paths-ignore (docs-only → SKIPPED-DOCS return).
  • Stage 2 Gitea poll (max 10 iter × 60s): API .../actions/tasks?limit=5 (NOT /runs 404). Match head_sha (NOT run_number — cancelled runs shift numbering). ⚠️ updated_at stale ~2min (gotcha #46) → cross-check VPS mtime. Foreground-sleep BLOCKED Windows → Monitor-style busy-wait-curl-spin ~30s/iter.
  • Stage 3 Test gate: baseline 354 PASS (45 Domain + 309 Infra). CI runs both proj BEFORE build/deploy → status=success ⟹ test gate passed (tasks terminal=status:success, conclusion NOT populated — trust success NOT log-numeric). Local grep undercounts (Theory/InlineData). Phase 9 UAT skip OK.
  • Stage 4 Post-deploy (if SUCCESS): auth login bearer (admin + nv.test gotcha #44; token=accessToken route /api/auth/login) → 3-5 endpoint smoke 2XX (incl new — NEW-endpoint bodyless-POST returns 411 IIS Length-Required pre-auth NOT 404; re-probe WITH body → 401 confirms wired) → FE bundle hash 2 app (per #69 = real-size-vs-fake not hash-delta) → SignalR negotiate (gotcha #25) → EF mig prod==repo.
    • Stage 4.6 (S29 CRITICAL): sqlcmd seed sample verify post-deploy (NOT only schema). Seed=runtime-row-insert (no history/tables advance) → verify via DB-query (e.g. Permissions count), NOT bundle-frozen-alone.
    • Discovery: ASP.NET 10 record enum needs numeric input unless JsonStringEnumConverter (SOL has NO converter → FE sends numeric). sqlcmd ssh Windows-auth \\\\SQLEXPRESS 4-backslash. INFRASTRUCTURE seed MUST run (NOT inside if(!demoSeedDisabled)); DEMO seed gated → gotcha #51.
  • Stage 5 Report PASS/FAIL + evidence + MEMORY update.
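
Stage 2 above (bounded poll, match on head_sha rather than run_number, trust status) can be sketched as follows. This is an illustrative sketch under assumptions: `fetch_tasks` is a hypothetical callable returning the task list already parsed from the API response, the `TERMINAL` set is assumed rather than taken from Gitea documentation, and `wait` stands in for whatever delay mechanism is available.

```python
from typing import Callable, Optional

# Assumed set of terminal statuses; only "success" is confirmed by the notes.
TERMINAL = {"success", "failure", "cancelled", "skipped"}


def find_task(tasks: list[dict], head_sha: str) -> Optional[dict]:
    """Pick the task for this commit by head_sha (run_number shifts on cancels)."""
    for task in tasks:
        if task.get("head_sha") == head_sha:
            return task
    return None


def poll_until_terminal(
    fetch_tasks: Callable[[], list[dict]],
    head_sha: str,
    max_iter: int = 10,
    wait: Callable[[], None] = lambda: None,
) -> str:
    """Poll at most max_iter times; return the terminal status or "TIMEOUT"."""
    for _ in range(max_iter):
        task = find_task(fetch_tasks(), head_sha)
        if task is not None and task.get("status") in TERMINAL:
            return task["status"]
        wait()
    return "TIMEOUT"
```

A "TIMEOUT" result maps to the escalate path rather than another round of polling.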

⚠️ Anti-patterns (DO NOT)

  1. Push fix code — READ only, escalate to main agent · 2. Speculate fail without log · 3. Skip post-deploy bundle hash (biggest catch) · 4. Skip MEMORY · 5. Poll forever (max 10 iter) · 6. Auto-rollback (escalate + recommend) · 7. Verify docs-only (return SKIPPED-DOCS immediately)
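
Anti-pattern 7 depends on classifying a push as docs-only before any verification starts. A minimal sketch, assuming the changed-file list covers the whole push range: `is_ignored` is a hand-written approximation of the three paths-ignore patterns in these notes, not Gitea's glob matcher, and edge cases (for example whether `**/*.md` covers a root-level file) may differ on the real runner.

```python
def is_ignored(path: str) -> bool:
    """Approximate ['docs/**', '**/*.md', '.claude/skills/**'] with plain string checks."""
    return (
        path.startswith("docs/")
        or path.startswith(".claude/skills/")
        or path.endswith(".md")
    )


def classify_push(changed_files: list[str]) -> str:
    """Return "SKIPPED-DOCS" only when every changed file in the range is ignored."""
    if changed_files and all(is_ignored(p) for p in changed_files):
        return "SKIPPED-DOCS"
    # One non-ignored file anywhere in the range means the whole range builds.
    # An empty list is treated as "BUILD" so that nothing is skipped by accident.
    return "BUILD"
```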

🧠 SOLUTION_ERP CI/CD essentials (S80 verified — re-grounded from canonical docs/STATUS.md)

  • Gitea: git.baocaogiaoduc.vn/vietreport-admin/solution-erp · workflow .gitea/workflows/deploy.yml · paths-ignore ['docs/**','**/*.md','.claude/skills/**']. Anon API works (public repo) when GITEA_TOKEN absent.
  • Prod: api/admin/eoffice .solutions.com.vn · SSH ssh vietreport-vps (Administrator, id_ed25519) · IIS phys paths: API C:\inetpub\solution-erp\api · admin \fe-admin · user \fe-user. DB .\SQLEXPRESS/SolutionErp/vrapp SQL-auth. Conn key = ConnectionStrings.Default (NOT DefaultConnection). pw vrapp/buKL3TGBkD0wDDbYVw65QeX9 read from prod appsettings.Production.json when $env:PROD_DB_PASSWORD empty. ⚠️ skill-doc path C:\inetpub\apps\SolutionErp\Api STALE.
    • SSH→PS quoting (S42): nested bash→ssh→powershell mangles $var/\" → use iconv UTF-16LE | base64 → powershell -EncodedCommand $B64; OR write ps1-file + powershell -File $(cygpath -w). sqlcmd over SSH direct -W -h -1 pw-inline. sys-catalog string-concat → COLLATE DATABASE_DEFAULT (collation conflict).
  • Tests baseline: 354 PASS (45 Domain + 309 Infra · 0 fail/skip; S77 Run #329 sha e823694). Phase 9 UAT skip per chunk OK.
  • Mig latest repo: Mig 57 20260619070051_AddPeSuggestedPriceNotes (S77 #323; PE +2 note-cols nvarchar(1000) — VERIFIED-APPLIED-PROD). Path src/Backend/SolutionErp.Infrastructure/Persistence/Migrations/. Prod check sqlcmd __EFMigrationsHistory ORDER BY MigrationId DESC TOP 5. ⚠️ sys.tables (is_ms_shipped=0) = 88 (S61 Budget-replace DROPPED 93→88 — narrative-93 STALE; when commit touches no schema, 88 correct, don't FAIL on 88. Always cross-ref COMMIT scope vs ambient count). enum-additive (e.g. ApprovalAttachment=5) = int-stored NO-mig → expect history NOT-advance + empty git diff -- '*Migrations*'.
  • Bundle hash live (S78 Run #330): admin js CsJetgZH/css Bvr5i5Nj · user js BVS0ApIm/css DHshp2tb. ⚠️ PER #69: hash ROTATES every deploy (non-deterministic) — this is a SNAPSHOT for THIS run, NOT a frozen baseline. Real FE ship = diff fe-*/src in commit/range + real-size-vs-fake (1.6/1.5MB vs ~919b SPA-trap), NOT hash delta. ASYMMETRIC-deploy framing (S66) RETIRED by #69.
  • Bearer: admin admin@solutions.com.vn/Admin@123456 (full) · UAT nv.test@solutions.com.vn/TestUser@123456 (Drafter CCM, gotcha #44 check).
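
The SSH→PS quoting workaround above relies on PowerShell's -EncodedCommand, which expects the script as base64 over UTF-16LE bytes. A small sketch of that encoding step (the function name is hypothetical; the iconv | base64 pipeline in the notes does the same thing in the shell):

```python
import base64


def encode_powershell(script: str) -> str:
    """Encode a script for `powershell -EncodedCommand`: UTF-16LE bytes, then base64."""
    return base64.b64encode(script.encode("utf-16-le")).decode("ascii")
```

Because the payload is plain base64, characters such as `$` and `"` pass through the nested bash and ssh layers without further escaping.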

🔑 Critical config (flag commit if it reappears)

Node CI 20.x (feedback_node_cicd) · MediatR 12.4.1 (gotcha #1, flag Version="14) · Swashbuckle 6.9.0 (gotcha #2) · act_runner manual checkout (#39) · npm cache DISABLED (#40, flag cache: npm)


🎯 Per-NV admin opt-in wire — 10-point checklist (cumulative S22→S23)

Cross-ref feedback_per_nv_permission_scope. Per-NV/per-Level refactor MUST verify: 1 Domain field · 2 EF HasDefaultValue(false) · 3 Mig 3-file · 4 Service read · 5 Domain+App DTO mirror · 6 Designer FE checkbox · 7 AwLevelDto+ToDto · 8 CreateAwLevelInput+Update mutation · 9 Lookup discrimination (FirstOrDefault ADD ApproverUserId==actorId + admin fallback) · 10 Controller body record count == Command record count. Missing 9-10 → silent prod bug with 2-3 days latency. Scan grep -n "FirstOrDefault.*Order.*==" *.cs after OR-of-N refactor.

📊 Run stats baseline

BE (test+build) ~90s · FE × 2 ~60s/app · deploy ~30s · total ~3min code / 0s docs-only. >5min → escalate. Recent runs all ~4m40-5m.


📅 Recent runs (compressed — full verbatim → archive/2026-06.md via archive/_INDEX.md)

  • S78 #330 7886fd0 PASS ~4m56s — cross-stack NO-MIG enum ApprovalAttachment=5 (attach-on-approve) + FE 2-app PeWorkflowPanel modal-upload. Verified: empty-migrations-diff + Mig-frozen-57 + tables88 + bundle BOTH rotate js+css (real 1.6/1.5MB vs fake-919b) + endpoint bodyless-POST-411→re-probe-body-401-wired + health 4×200. → _INDEX.
  • S76 #318 e33481e PASS ~4m58s — full-stack Mig56 ProInitial/ProAdjust VERIFIED-APPLIED-PROD (sys.columns decimal(18,2)) + 5th-axis BACKFILL-verify 4-rows-0-violation (gotcha #64 prod-data-UPDATE-first-time) + history-advance-55→56 + tables88 + bundle BOTH+css rotate + health 4×200. Pre-deploy DB snapshot (Mig55) → post (Mig56) = unambiguous proof. → _INDEX.
  • S83 #325 e29391e PASS ~4m39s — FE-only tiny budget-subitem indent/dash + bundle BOTH-js-rotate css-FROZEN (utility-reuse, no new css chunk) + Mig-frozen-57 + tables88 + test351. → _INDEX.
  • (S77 #329 e823694 FE-only banner Trả-lại · #328 424131d BE-only notify-block test-gate-KEY · #327 fa6654b FE-list-restructure · #326 b5aa72d BE-authz asymmetric · #323 #322 #321 #320 PE UX batch → all PASS, FIFO-trimmed, full verbatim git + archive _INDEX)
  • (S74-bis #315 8655ebf Mig55 CcmNote · S73 #313 1d86abc Mig54 5-cols + endpoint-401-not-404-probe · S69b #307 1f8947e Office-golive seed-16/16-DB-query-proof → archive _INDEX + git)
  • Older runs (S75 #301 ← S62 #286 ← … → S29 #232) full verbatim → git d2f52ba + archive/2026-06.md (incl #291 06-16 FAIL forensic [gotcha #65]); pre-S38 → archive/2026-05-{runs,q2,q3,q4}.md.

🔄 Curate trigger

  • ~30KB → archive recent runs → L2 archive/<period>.md (byte-exact append) + _INDEX.md substring pointer. Dup failure patterns → merge. Stale >3mo → remove.

  • Last curate: 2026-06-20 S80 (main agent, archive-gate keep-floor-hit → manual) (38.8→~16KB): moved 3 huge run-entries #330/#318/#325 (lines 73/90/96 byte-exact via sed) → archive/2026-06.md (100.6→116.7KB) + _INDEX.md +3 substring pointers (verified count=1). KEPT foundation (bug-patterns/5-stage/10-point gold) + compressed essentials + run-stubs; UPDATED stale (130→354 test Stage-3 + 263→354 essentials, Mig 55→57, bundle→#330) + dropped verbose per-run bundle-history (S86/S85/S84 → archived run-entries).
  • Prev curate: 2026-06-17 S70 Harness-9 (65.2→23.2KB dark-matter recovery, built _INDEX/gist gen:1). Prev S66 (86.8→29.2KB). Prev S40 (35.3→~21KB).