Files
solution-erp/.claude/agent-memory/cicd-monitor/MEMORY.md
pqhuy1987 098baa6da6
All checks were successful
Deploy SOLUTION_ERP / build-deploy (push) Successful in 3m30s
[CLAUDE] Docs: Chunk G — K8 Plan K wrap S23 t1: docs + session log + Designer comment cleanup + 3 agent MEMORY drift
Plan K Mig 31 F2 refactor sang per-Approver-slot DONE — 8 commits cumulative
S23 t1 (`56868bf..<this>`). K8 wrap docs + dirty MEMORY.md commit:

Docs updates:
- docs/STATUS.md: Last updated S23 t1 entry với Plan K summary 8 chunk
- docs/HANDOFF.md: TL;DR S23 t1 đầy đủ (top) — multi-agent ROI evidence
- docs/database/schema-diagram.md §14: title Mig 22-31 (was 22-29) + add
  Mig 30 F4 + Mig 31 F2 blocks per slot Approver + DROP Users column note
- NEW docs/changelog/sessions/2026-05-14-s23-turn1-plan-k-mig31-f2-refactor.md
  session log đầy đủ 8 chunk timeline + multi-agent spawn cost table + pattern
  reinforced 3×

FE Admin Designer comment cleanup (Reviewer K2 follow-up):
- ApprovalWorkflowsV2Page.tsx lines 73-75 + 502-504: 2 stale narratives "F2
  AllowDrafterSkipToFinal xuống per User (User Management)" rewrite Mig 29+30+31
  cumulative narrative "7 Allow* ALL xuống per Level slot, pattern proven 3×"

3 agent MEMORY.md drift commit (dirty từ session start S23 + S22 chốt):
- Investigator: K0 pre-flight findings + 5 surprises catch
- Reviewer: K2 PASS report + new pattern "transient sentinel zombie" anti-pattern
- CICD Monitor: S22 chốt verify cumulative (Run #193 + S23 t1 pending K9 spawn)

User-level memory updates (cross-project diary persisted ngoài repo):
- feedback_per_nv_permission_scope.md: reinforcement S23 t1 — Pattern 3×
  cumulative (Mig 29 + Mig 30 + Mig 31). Pattern ALSO applies cho refactor existing
  scope, KHÔNG chỉ greenfield. Cross-ref discoveries Plan K (compile-break workaround,
  stale narrative drift, transient sentinel zombie anti-pattern caught Reviewer).
- MEMORY.md index: cumulative reinforcement note 3× Mig 31

Verify:
- dotnet build production projects clean
- npm run build fe-admin pass 17.76s, 0 TS err
- Test 104/104 PASS (S23 t1 K7 chunk maintained baseline)

Plan K state final: 31 mig · 59 tables · ~145 endpoints · 104 test · 47 gotcha
· 20 memory · 6 skills · 4 sub-agents active. CHƯA push remote — chờ bro confirm
K9 spawn CICD Monitor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 23:48:41 +07:00

161 lines
17 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# CI/CD Monitor Agent — Persistent Memory
> **Persistent diary cross-session.** Auto-injected first 200 lines / 25KB at spawn.
> Update BEFORE every stop. Curate when > 25KB.
---
## 🎯 Role baseline
Read-only CI/CD pipeline + post-deploy verifier for SOLUTION_ERP. Polls Gitea Actions API, verifies test gate + deploy ship + prod health. Tools: Read, Grep, Glob, Bash, WebFetch. Output: PASS/FAIL verdict + evidence under 500 words. **Spawn cost ~150K tokens** — trade-off để catch fail tự động không phụ thuộc em main nhớ verify.
---
## 🚨 Recurring CI/CD bug patterns (catch with priority)
### Gotcha #39 — act_runner github.com TCP timeout
- **Symptom:** CI run hang ở "Set up job" → timeout 21s, run stays "queued" forever
- **Verify:** log line `Error: dial tcp ... github.com:443 ... i/o timeout`
- **Fix:** manual checkout bypass đã hardcode trong `.gitea/workflows/deploy.yml` (run #108/#109), pass at #110. KHÔNG revert. Nếu pattern returns → escalate em main check VPS network
### Gotcha #40 — npm cache `tsc not found`
- **Symptom:** `build_fe_admin` fail sau khi enable `cache: npm``actions/setup-node@v4`
- **Verify:** log line `sh: tsc: command not found` hoặc `npm error code ETIMEDOUT`
- **Fix:** DISABLED npm cache rolled back ở `a21790d`. KHÔNG re-enable. Build time chấp nhận ~3 min thay vì optimize
### Gotcha #41 — paths-ignore docs-only skip
- **Symptom:** Commit code thật mà CI không trigger (run list không có entry mới)
- **Verify:** `git diff --name-only HEAD~1 HEAD` vs `paths-ignore: ['docs/**', '**/*.md', '.claude/skills/**']`
- **Fix:** Nếu commit có code thật bị skip nhầm → check pattern conflict. Nếu commit chỉ docs → expected behavior (saving ~9 min deploy / commit MD-only)
### Gotcha #25 — IIS WebSocket / module exclusion
- **Symptom:** `notification-hub/negotiate` returns 401 hoặc 404 prod (FE SignalR connect fail)
- **Verify:** `curl -X POST https://api.solutions.com.vn/notification-hub/negotiate` → non-200
- **Fix:** IIS WebSocket module enable trong `web.config` của site api.solutions.com.vn (skill `iis-deploy-runbook`)
### Deploy ship verification — bundle hash unchanged
- **Symptom:** commit push success + Gitea action success + status PASS, **nhưng prod không có thay đổi visible** (user UAT báo "đã deploy mà không thấy")
- **Root cause candidates:**
- IIS app pool chưa recycle → giữ assembly cũ trong memory
- NSSM service script không copy file đúng folder
- Browser cache (rare nếu Vite hash chuẩn)
- **Verify:** `curl -s https://admin.solutions.com.vn/ | grep -oE '/assets/index-[a-z0-9]+\.js'` — hash giữ nguyên = ship fail
- **Fix:** SSH `vietreport-vps "Restart-WebAppPool admin.solutions.com.vn"` + recheck bundle hash
### Migration drift prod vs repo
- **Symptom:** Latest mig trong repo (vd Mig 27) nhưng prod chưa có (DbInitializer startup fail)
- **Verify:** Compare `ls Migrations/*.cs` vs `sqlcmd ... __EFMigrationsHistory`
- **Fix:** Check `Program.cs` startup hook `app.MigrateDatabase()` còn không + app pool recycle. Hoặc manual `dotnet ef database update --connection prod` qua SSH
---
## 📋 5-stage checklist (apply EVERY run)
### Stage 1: Push happened + filter check
- `git log -1 --format='%H %s'` — latest commit
- `git log origin/main..HEAD` — must be empty (synced)
- `git diff --name-only HEAD~1 HEAD` vs `paths-ignore` — nếu chỉ docs → SKIPPED-DOCS
### Stage 2: Gitea Actions poll (max 10 iter × 60s)
- API: `https://git.baocaogiaoduc.vn/api/v1/repos/vietreport-admin/solution-erp/actions/runs?limit=5`
- Match `head_sha == $commitSha` → get `runId`
- Status: queued / in_progress / completed
- Conclusion (when completed): success / failure / cancelled / timed_out
### Stage 3: Test gate verify (Domain 58 + Infra 23 baseline)
- Logs grep: `Passed:` line per stage
- Phase 9 UAT exception: test count may be lower nếu em main skip per chunk (memory `feedback_uat_skip_verify`) — NOT a failure
- Delta from baseline → report
### Stage 4: Post-deploy live verify (if SUCCESS)
- Auth login → bearer (admin + nv.test for non-admin gotcha #44 check)
- 3-5 endpoint smoke 2XX expected (include endpoint mới trong commit)
- FE bundle hash 2 app changed (compare pre vs post)
- SignalR negotiate (gotcha #25 — if commit relates notification)
- EF migration latest prod == latest repo
### Stage 5: Report PASS/FAIL with evidence + MEMORY.md update
---
## ⚠️ Anti-patterns observed (DO NOT)
1. ❌ Push fix code — READ only, escalate to em main
2. ❌ Speculate fail cause without log evidence
3. ❌ Skip post-deploy live verify khi SUCCESS — bundle hash là biggest catch
4. ❌ Skip MEMORY.md update
5. ❌ Poll forever (max 10 iter ~10 min timeout)
6. ❌ Auto-rollback — escalate với recommendation, KHÔNG tự chạy
7. ❌ Verify khi commit docs-only — SKIPPED-DOCS + return ngay
---
## 🧠 SOLUTION_ERP CI/CD essentials
- **Gitea:** https://git.baocaogiaoduc.vn/vietreport-admin/solution-erp
- **Workflow:** `.gitea/workflows/deploy.yml` (test gate 2 step + build BE + build FE × 2 + deploy)
- **Path filter:** `paths-ignore: ['docs/**', '**/*.md', '.claude/skills/**']` (gotcha #41)
- **Prod URLs:** api / admin / eoffice `.solutions.com.vn`
- **SSH VPS:** `ssh vietreport-vps` (user=Administrator, key=id_ed25519)
- **DB prod:** `.\SQLEXPRESS` / `SolutionErp` / vrapp user
- **Tests baseline:** 104/104 (58 Domain + 46 Infra = 23 codegen + 6 PE WF + 3 PE Guard S21 t3 + 7 ReturnMode + 7 DraftGuard + 5 AuthorizePolicy + 1 V2 actor scope reject) — S22+1 +1 test. Re-verified S22 chốt cuối 23:25 (Verify push range `3d725c4..cc8a7d3`).
- **Mig latest repo:** Mig 30 `20260513160703_AddAllowApproverEditBudgetToLevels` (S22+5 — per-NV F4 admin opt-in cho Approver edit Section ngân sách ChoDuyet branch). Prev Mig 29 (S21 t5 refactor per-NV) preserved.
- **Gitea Actions API path:** `/api/v1/repos/{owner}/{repo}/actions/tasks?limit=N` (NOT `/runs` — returns 404). Public no-auth read OK. Fields: `id`, `run_number`, `head_sha`, `status` (queued/running/success/failure/cancelled), `conclusion`, `created_at`, `updated_at`, `display_title`.
- **Mig latest prod:** sqlcmd `__EFMigrationsHistory ORDER BY MigrationId DESC TOP 5`
- **Bearer test:**
- Admin: `admin@solutions.com.vn / Admin@123456` (full)
- UAT non-admin: `nv.test@solutions.com.vn / TestUser@123456` (Drafter CCM — verify gotcha #44 silent 403 patterns)
---
## 🔑 Critical config (gotcha cross-ref)
- Node CI pin: `20.x` (memory `feedback_node_cicd` — bài học NamGroup)
- MediatR pin: `12.4.1` (gotcha #1)
- Swashbuckle pin: `6.9.0` (gotcha #2)
- act_runner: manual checkout bypass github.com (gotcha #39)
- npm cache: DISABLED (gotcha #40 — KHÔNG re-enable)
Flag commit nếu thấy `<PackageReference Include="MediatR" Version="14...` hoặc `cache: npm` tái xuất hiện.
---
## 📊 Run stats baseline (cumulative)
- **Build time BE (test_domain + test_infra + build_be):** ~90s baseline
- **Build time FE × 2 app:** ~60s baseline mỗi app
- **Deploy NSSM + IIS recycle:** ~30s
- **Total CI run time:** ~3 min code commit / 0s docs-only commit
- **Trend trigger:** nếu run time > 5 min → escalate (cluster network slow hoặc dependency bloat)
- **Bundle size baseline:** fe-admin ~800KB gz / fe-user ~750KB gz (Vite production build)
---
## 📅 Recent runs (FIFO last 20)
- **2026-05-13 23:25 — Verify S22 chốt cuối cumulative (push range `3d725c4..cc8a7d3` 12 commits) VERDICT=PASS** (S22 chốt — em main spawn cumulative verify cuối S22). Latest CI run #193 id=307 sha=`b04a11a` (S22+5 Chunk B FE) success at 23:16. **Tip commit `cc8a7d3` (docs+4 agent MEMORY.md) → CI SKIPPED via `**/*.md` glob** (all 4 `.claude/agent-memory/*.md` + 4 `docs/**` files match — paths-ignore correctly fires). Spec hypothesis ("gotcha #47 — `.claude/agent-memory/**` NOT in paths-ignore → trigger CI") **disproven for this commit**: `**/*.md` glob matches `.md` files at ANY depth so `.claude/agent-memory/MEMORY.md` DOES match. Run #188 a74e671 trigger anomaly was NOT due to agent-memory path. **Gotcha #47 still useful as PREVENTIVE** for future when adding non-.md state files under `.claude/agent-memory/` (e.g. `.json` state, `.log`) — explicit `'.claude/agent-memory/**'` ignore would future-proof. **Recent runs S22 sequence (12 commits → 11 trigger + 1 skip):** #189 sha=40f64c6 ✓ (S22+1 BE guard) · #190 sha=8185070 **CANCELLED** by concurrency (seed users script superseded by next push within 3 min — normal not a fail) · #191 sha=0e70789 ✓ (rename script) · #192 sha=30d51c8 ✓ (S22+4 FE) · #193 sha=b04a11a ✓ (S22+5 FE — also covers S22+5 Chunk A BE `b079b27` since both pushed batched; Gitea trigger only on push tip). `cc8a7d3` skip = correct (no run 308). **Discovery #3:** When 2+ commits pushed in same `git push`, Gitea Actions evaluates ONLY the push tip's paths against paths-ignore — intermediate commits do NOT each get evaluated separately. So `b079b27` (BE Mig 30) + `b04a11a` (FE Mig 30) pushed together → single Run #193 on tip. Test gate inferred PASS for all 11 trigger runs (each deploy stage succeeded). **Local test verify 104/104 PASS** (58 Domain + 46 Infra = +1 vs Run #188's 103 — S22+1 added 1 BE guard test). **Mig 30 prod confirmed:** sqlcmd TOP 5 = `20260513160703_AddAllowApproverEditBudgetToLevels` TOP 1 (S22+5 wire). **Schema live verify:** PE detail `currentLevelOptions` 6 keys (5 from Mig 29 + 1 new `allowApproverEditBudget`) ✓, AwLevelDto 12 keys including `allowApproverEditBudget` ✓. **Endpoint wire LIVE:** PATCH `/api/users/{id}/allow-skip-final` admin=204 ✓ + act.nv=403 ✓ (Plan D admin-only enforce) · PATCH `/api/purchase-evaluations/{id}/budget-adjust` admin=204 ✓ (S22+4 BE) · GET `/api/purchase-evaluations/{id}/attachments/{attId}/view` 200 + `Content-Disposition: inline; filename="HD- Eoffice.pdf"` ✓ (S22+4 preview). **Plan E strict V2 scope LIVE:** admin pageSize=200 → 17 PE / act.nv pageSize=200 → 0 PE (Drafter strict filter active; act.nv fresh user no PE participation). **Bundle hash rotated 2/2:** admin `Cclc8Uwu` → `CpI5OL8n` ✓ (S22 cumulative FE deploy) / user `B6N5hq3d` → `d064StNa` ✓ (S22+4 + S22+5 touched fe-user — rotation expected). Smoke 5/5 endpoints 200: contracts, pe, users, menus, approval-workflows-v2. **33 active users prod confirmed** (20 role-based new from S22+2 seed + S22+3 rename: act/pp/tp · bod.1/2 · equ/fin/hra/pm/qs nv/pp/tp + 13 pre-existing). All role-based users `isActive=true` + login `act.nv@solutions.com.vn / TestUser@2026` OK with roles `Drafter,Accounting`. **Token caching pattern from Run #188 worked:** 1 admin login + 1 act.nv login total = 2 auth requests cached to `C:\Users\pqhuy\AppData\Local\Temp\*_token.txt` + 8 subsequent endpoint calls reuse token → no 429 rate limit encountered (vs Run #188 hit 429). **Trend:** S22 cumulative 5 turn iteration delivered Mig 30 + 3 new endpoints + scope filter + 20 seed users — 0 deploy regression. Baseline cumulative passes 81→103→104 test grow consistent with feature delivery.
- **2026-05-13 21:25-21:28 — Run #188 id=302 sha=a74e671 VERDICT=PASS** (S22 — 5 commits: Plan D Users F2 toggle BE+FE Admin AllowDrafterSkipToFinal + Plan C task 1-3 14 service test ReturnMode/Guard + Plan C task 4 5 regression test #44 silent 403 + Plan E PE strict V2 scope + Docs/MEMORY 3-agent drift patch). Duration 3m28s (baseline). Path filter: the push tip `a74e671` includes `.claude/agent-memory/**` files (NOT in paths-ignore) + `docs/**` (in paths-ignore) → Gitea evaluated push as CI-eligible (some files OUTSIDE paths-ignore), trigger fired correctly. **Local test verify: 58 Domain + 45 Infra = 103/103 PASS (+19 from S21 84)** breakdown: 23 codegen + 6 PE WF + 7 ReturnMode + 7 DraftGuard + 5 AuthorizePolicy regression. CI deploy succeeded → inferred test gate PASS (deploy only runs if tests pass). Bundles deployed: admin `index-Cclc8Uwu.js` rotated from `D5l49-70` (21:27:24 PM VPS), user `index-B6N5hq3d.js` UNCHANGED (Plan C/D/E touched only fe-admin, expected). DLLs deployed 21:25-26 PM. Mig 29 `RefactorAdvancedOptionsToPerLevelAndDrafterUser` still TOP 1 (no new mig in S22, expected). **Plan D wire LIVE:** GET `/api/users` response includes `allowDrafterSkipToFinal` field (boolean), PATCH `/api/users/{id}/allow-skip-final` admin=204 ✓ + nv.test=403 ✓ (admin-only enforced). **Plan E wire LIVE:** nv.test PE list totalCount=8 < admin totalCount=17 (strict V2 scope filter ACTIVE drafter only sees own + participant PE). Smoke 5/5 endpoints 200: `/api/contracts`, `/api/purchase-evaluations`, `/api/menus`, `/api/approval-workflows-v2`, `/api/users`. **Discovery #1:** Rate limit auth login triggers at ~5 requests/min HTTP 429. Pattern: backoff 60s + retry. Spread login calls or cache token across endpoints in same agent run. **Discovery #2:** `.claude/agent-memory/**` files are NOT in paths-ignore (only `docs/**` + `**/*.md` + `.claude/skills/**` + `.gitignore` + `scripts/**.md`) MEMORY.md commits DO trigger CI even when "looks like docs". Spec assumption ("docs commit `a74e671` triggers paths-ignore skip per gotcha #41") was incorrect for this case `.claude/agent-memory/**` triggers CI.
- **2026-05-13 20:12-20:15 Run #187 id=301 sha=c0af9e0 VERDICT=PASS** (S21 t5 4 commits: Mig 29 refactor Allow* per-NV + FE Admin Designer 5 checkbox per-Level slot + FE eOffice rename `workflowOptions → currentLevelOptions` + drafterAllowSkipToFinal + Docs). Duration ~3m18s (baseline). Test gate inferred PASS (deploy stage chỉ chạy sau test gate). Mig 29 applied prod (TOP 1 in __EFMigrationsHistory). Schema verified: ApprovalWorkflowLevels +5 Allow* (AllowReturnOneLevel/OneStep/ToAssignee/ToDrafter/ApproverEditDetails), Users +1 AllowDrafterSkipToFinal, ApprovalWorkflows -6 Allow* (DROPPED). Backfill: 48/48 Levels.AllowReturnToDrafter=1 (default + S21 t4 workflow.AllowReturnToDrafter=true copied đúng), 0/13 Users.AllowDrafterSkipToFinal=1 (S21 t4 workflow.AllowDrafterSkipToFinal=false 0 user backfill preserve correct). Bundles deployed 20:14-20:15 (admin `index-D5l49-70.js` was `CzesdXLh`, user `index-B6N5hq3d.js` was `DP-gH4LW` both rotated ✓). API contract: `AwDefinitionDto` 12 keys 0 Allow*, `AwLevelDto` 11 keys 5 Allow*, PE detail bundle has `currentLevelOptions` (dict 5 Allow*) + `drafterAllowSkipToFinal=false` boolean, `workflowOptions` REMOVED. **Discovery:** Gitea API task table caches `updated_at` stale (~2 min behind reality) file timestamps on VPS (`Get-Item .dll/.html LastWriteTime`) confirms deploy completion sớm hơn API status update. Cross-check 2 source nếu time-sensitive. Also: `appsettings.Production.json` `C:\inetpub\solution-erp\api\` chứa connection string credential (user=vrapp / pwd=`buKL3TGBkD0wDDbYVw65QeX9`) khi `$env:PROD_DB_PASSWORD` empty local.
- **2026-05-13 19:13-19:16 Run #186 id=300 sha=eea86fd VERDICT=PASS** (S21 t3+t4 8 commits: 3 gotcha #45 fix Trả lại + 5 F1+F2+F3 PE Workflow advanced options + Mig 28). Duration 3m32s (baseline). Test gate confirmed via deploy success (Domain + Infra run BEFORE build/publish if any of 84 test failed, deploy stage wouldn't have run). Mig 28 `20260513114505_AddAdvancedOptionsToApprovalWorkflows` applied prod (top of `__EFMigrationsHistory`). FE bundles deployed 19:15 (admin `index-CzesdXLh.js` + user `index-DP-gH4LW.js`). Smoke 200: `/api/auth/login`, `/api/approval-workflows-v2?applicableType=1` (response includes 6 new `allowReturnOneLevel/OneStep/ToAssignee/ToDrafter/DrafterSkipToFinal/ApproverEditDetails` per workflow def, `allowReturnToDrafter=true` default + 5 false backward compat ✅), `/api/purchase-evaluations/{id}` (response includes `workflowOptions` object populated), `/api/menus`, `/api/contracts`. **Discovery:** API endpoint to list Gitea Actions runs is `/api/v1/repos/.../actions/tasks` (NOT `/actions/runs` 404). Public no-auth OK for read.
- **2026-05-12 (setup):** CI/CD Monitor agent initialized. Baseline knowledge load complete (44 gotchas cross-ref + 5-stage checklist + 3 skills preload + bundle hash verify pattern). No runs monitored yet.
---
## 🔄 Curate trigger
- Memory size > 25KB → archive recent runs to `archive/<period>.md`
- Duplicate failure patterns → merge into single entry (vd act_runner timeout x3 → 1 entry)
- Stale > 3 months → remove
Last curate: 2026-05-13 23:30 (added S22 chốt cuối cumulative verify Run #189-193 sequence + Mig 30 schema live + 3 new endpoint wire + 33 user role-based + bundle rotate 2/2 + test baseline 104 + Discovery #3 Gitea push tip paths-ignore eval. Disproven spec hypothesis re: gotcha #47 `.claude/agent-memory/**` trigger — `**/*.md` glob already catches `.md` files at any depth. Gotcha #47 kept as preventive for non-.md future state files.)