- Session log S42-S43 (P11-A WorkflowApps ApproveV2 + P11-B LeaveBalance, 8 commit chain) - HANDOFF tiering: +S43 +S42, trim S40-S38 → session logs - gotcha #56 CWD-drift stray memory (cd trước spawn → agent ghi nhầm fe-user/.claude, 3× S42-S43) - STATUS gotchas 55→56 - cicd-monitor MEMORY (Run #367 P11-B verdict) User memory: +feedback_high_to_max_multiagent_quality (High lọt 2 bug, Max 0 bug; WIRE FE đọc reference proven + FK-invariant-at-write-doors + Max re-review cross-stack). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
84 lines
14 KiB
Markdown
84 lines
14 KiB
Markdown
# CI/CD Monitor Agent — Persistent Memory
|
||
|
||
> **Persistent diary cross-session.** Auto-injected first 200 lines / 25KB at spawn.
|
||
> Update BEFORE every stop. Tiered Memory v1: L1 HOT soft-cap ~30KB · L2 `archive/` on-demand · L3 RAG `search_memory` just-in-time. Keep entry ≤ 1.5K chars (gotcha #53).
|
||
> Full verbatim run history pre-S40 → git `d2f52ba` + `archive/2026-05-{runs,q2,q3,q4}.md`.
|
||
|
||
---
|
||
|
||
## 🎯 Role baseline
|
||
|
||
Read-only CI/CD + post-deploy verifier SOLUTION_ERP. Polls Gitea Actions API, verifies test gate + deploy ship + prod health. Tools: Read, Grep, Glob, Bash, WebFetch + 5 RAG MCP. Output: PASS/FAIL + evidence <500 words. Skills: `iis-deploy-runbook` + `dependency-audit-erp` + `ef-core-migration`. Spawn ~150K — trade-off catch fail tự động.
|
||
|
||
---
|
||
|
||
## 🚨 Recurring CI/CD bug patterns (catch priority)
|
||
|
||
- **#39 act_runner github.com TCP timeout** — run hang "Set up job" 21s. Log `dial tcp github.com:443 i/o timeout`. Fix: manual checkout bypass hardcoded `.gitea/workflows/deploy.yml` (pass #110). KHÔNG revert.
|
||
- **#40 npm cache `tsc not found`** — `build_fe_admin` fail post `cache: npm`. DISABLED rolled back `a21790d`. KHÔNG re-enable.
|
||
- **#41 paths-ignore docs-only skip** — code commit không trigger CI? Check `git diff --name-only HEAD~1 HEAD` vs `paths-ignore: ['docs/**','**/*.md','.claude/skills/**']`. Discovery #3: Gitea evaluates push *range* commits — nếu ≥1 commit có non-ignored file → toàn range build (BENEFICIAL).
|
||
- **#25 IIS WebSocket** — `notification-hub/negotiate` 401/404 prod. Fix: WebSocket module enable `web.config` site api (skill `iis-deploy-runbook`).
|
||
- **#48 SQLite tie-break** — `OrderByDescending(CreatedAt).First()` pick wrong khi 2+ `.Add()` cùng frozen-clock. Fix: discriminator filter `.Where(Summary.Contains("Chuyển phase"))` BEFORE OrderBy.
|
||
- **Bundle hash unchanged = ship FAIL** — push+action success nhưng prod không đổi. Verify `curl -s https://admin.solutions.com.vn/ | grep -oE '/assets/index-[a-z0-9]+\.js'`. Fix: SSH `Restart-WebAppPool`. ⚠️ Bundle hash verify MUST sau status=success (Run #242 false-positive lesson: check khi "running" → stale hash).
|
||
- **Migration drift prod vs repo** — compare `ls .../Persistence/Migrations/*.cs` vs `sqlcmd __EFMigrationsHistory`. Fix: check `Program.cs` `app.MigrateDatabase()` + app pool recycle.
|
||
|
||
---
|
||
|
||
## 📋 5-stage checklist (EVERY run)
|
||
|
||
- **Stage 0 RAG infra:** `Get-Service Qdrant` Running + `http://localhost:6333/healthz`. Collection `proj_solution_erp` (prefix `proj_*` 7 project — Discovery #8).
|
||
- **Stage 1 Push+filter:** `git log -1 --format='%H %s'` + `git log origin/main..HEAD` empty + diff vs paths-ignore (docs-only → SKIPPED-DOCS return).
|
||
- **Stage 2 Gitea poll** (max 10 iter × 60s): API `.../actions/tasks?limit=5` (NOT `/runs` 404). Match `head_sha`. ⚠️ task table `updated_at` stale ~2min (gotcha #46) → cross-check VPS mtime.
|
||
- **Stage 3 Test gate:** baseline **130 PASS** (58 Domain + 72 Infra). Phase 9 UAT exception lower OK (`feedback_uat_skip_verify`).
|
||
- **Stage 4 Post-deploy** (if SUCCESS): auth login bearer (admin + nv.test gotcha #44; token=`accessToken` route `/api/auth/login`) → 3-5 endpoint smoke 2XX (incl new) → FE bundle hash 2 app changed → SignalR negotiate (gotcha #25 if relevant) → EF mig prod==repo.
|
||
- **Stage 4.6 (S29 CRITICAL):** sqlcmd seed sample verify post-deploy (NOT chỉ schema). `sqlcmd -Q "SELECT Code FROM ApprovalWorkflows WHERE Code LIKE 'QT-%-V2-%'"` → 0 rows = seed GATE BLOCKED → gotcha #51.
|
||
- Discovery #4: ASP.NET 10 record enum cần numeric input unless `JsonStringEnumConverter` (SOL has NO converter → FE sends numeric). #5: sqlcmd ssh Windows-auth cần `\\\\SQLEXPRESS` 4-backslash. #6: INFRASTRUCTURE seed (Roles/Depts/Catalogs/MenuTree/AdminPerms/Templates/SampleWorkflowsV2) MUST run, NOT inside `if(!demoSeedDisabled)`; DEMO seed (DemoUsers/Contracts/PE) OK gated → gotcha #51.
|
||
- **Stage 5 Report** PASS/FAIL + evidence + MEMORY update.
|
||
|
||
---
|
||
|
||
## ⚠️ Anti-patterns (DO NOT)
|
||
1. ❌ Push fix code — READ only, escalate em main · 2. ❌ Speculate fail without log · 3. ❌ Skip post-deploy bundle hash (biggest catch) · 4. ❌ Skip MEMORY · 5. ❌ Poll forever (max 10 iter) · 6. ❌ Auto-rollback (escalate + recommend) · 7. ❌ Verify docs-only (SKIPPED-DOCS return ngay)
|
||
|
||
---
|
||
|
||
## 🧠 SOLUTION_ERP CI/CD essentials (S40 verified)
|
||
|
||
- **Gitea:** `git.baocaogiaoduc.vn/vietreport-admin/solution-erp` · workflow `.gitea/workflows/deploy.yml` · paths-ignore `['docs/**','**/*.md','.claude/skills/**']`
|
||
- **Prod:** api/admin/eoffice `.solutions.com.vn` · SSH `ssh vietreport-vps` (Administrator, id_ed25519) · IIS site phys paths (S42 verified): API `C:\inetpub\solution-erp\api` · admin `\fe-admin` · user `\fe-user` (3 sites Started). DB `.\SQLEXPRESS`/`SolutionErp`/`vrapp` SQL-auth. **Conn string key = `ConnectionStrings.Default` (NOT `DefaultConnection`!)** — read pw from prod appsettings.Production.json when `$env:PROD_DB_PASSWORD` empty.
|
||
- **SSH→PS quoting (S42 lesson):** nested bash→ssh→powershell mangles `$var`/`\"`. Use `iconv UTF-16LE | base64` → `powershell -EncodedCommand $B64`. Single-quote literal paths.
|
||
- **Tests baseline:** **154 PASS** (S42 P11-B Run #367 sha 82d7fcf; +11 LeaveBalance +13 P11-A vs prev 130). CI gate runs both test projects BEFORE build/deploy → status=success ⟹ test gate passed (`tasks` endpoint reports terminal as `status:success`, `conclusion` field NOT populated). Local grep undercounts (Theory/InlineData) — trust CI conclusion. Phase 9 UAT mode skip per chunk OK.
|
||
- **Mig latest repo:** **Mig 42 `20260530034336_AddLeaveBalances`** (S42 P11-B; prod tables 91). Path `src/Backend/SolutionErp.Infrastructure/Persistence/Migrations/`. Prod check `sqlcmd __EFMigrationsHistory ORDER BY MigrationId DESC TOP 5`.
|
||
- **Bearer:** admin `admin@solutions.com.vn/Admin@123456` (full) · UAT `nv.test@solutions.com.vn/TestUser@123456` (Drafter CCM, gotcha #44 check)
|
||
- **Bundle hash live S42:** admin `Krjvg_3j` · user `6sNStgxa` (Run #367 sha 82d7fcf). Prev admin `BU8FTBRi` · user `tepE4jvR` (#366/ffb2062). Bundle size ~800KB/750KB gz.
|
||
- **DB pw (S42, when `$PROD_DB_PASSWORD` empty):** `vrapp/buKL3TGBkD0wDDbYVw65QeX9` read from `C:\inetpub\solution-erp\api\appsettings.Production.json`→`ConnectionStrings.Default`. ⚠️ Skill-doc path `C:\inetpub\apps\SolutionErp\Api` is STALE → real path `C:\inetpub\solution-erp\api`. sqlcmd over SSH works direct (no UTF-16 encode needed). ⚠️ sys-catalog string-concat queries hit collation conflict (`Latin1_General_CI_AS_KS_WS` vs `SQL_Latin1_General_CP1_CI_AS`) → add `COLLATE DATABASE_DEFAULT` per concatenated column.
|
||
|
||
## 🔑 Critical config (flag commit nếu tái xuất)
|
||
Node CI `20.x` (`feedback_node_cicd`) · MediatR `12.4.1` (gotcha #1, flag `Version="14`) · Swashbuckle `6.9.0` (gotcha #2) · act_runner manual checkout (#39) · npm cache DISABLED (#40, flag `cache: npm`)
|
||
|
||
---
|
||
|
||
## 🎯 Per-NV admin opt-in wire — 10-point checklist (cumulative S22→S23)
|
||
Cross-ref `feedback_per_nv_permission_scope`. Per-NV/per-Level refactor MUST verify: 1 Domain field · 2 EF `HasDefaultValue(false)` · 3 Mig 3-file · 4 Service read · 5 Domain+App DTO mirror · 6 Designer FE checkbox · 7 AwLevelDto+ToDto · 8 CreateAwLevelInput+Update mutation · 9 **Lookup discrimination** (`FirstOrDefault` ADD `ApproverUserId==actorId` + admin fallback) · 10 **Controller body record count == Command record count**. Bug latency 2-3 days prod silent khi miss 9-10. Scan `grep -n "FirstOrDefault.*Order.*==" *.cs` after OR-of-N refactor.
|
||
|
||
## 📊 Run stats baseline
|
||
BE (test+build) ~90s · FE × 2 ~60s/app · deploy ~30s · **total ~3min code / 0s docs-only**. >5min → escalate.
|
||
|
||
---
|
||
|
||
## 📅 Recent runs (FIFO — older → archive/git)
|
||
|
||
- **2026-05-30 Run #367 (run_number 253) sha=`82d7fcf` PASS ~4m08s (S42 P11-B LeaveBalance business logic, Mig 42):** Code commit 22 files (4 BE: Domain `LeaveBalance.cs` + App `LeaveBalanceFeatures.cs`/`LeaveOtApprovalFeatures` deduction hook + `LeaveBalancesController` + IApplicationDbContext + DbContext + Config + Mig42 3-file + 2 FE `WorkflowAppDetailPage`×2 +`workflowApps.ts`×2 + 2 tests + 4 agent-memory .md). Started 11:11:40 → success iter4 11:15:48. **Bundle rotate admin `BU8FTBRi→Krjvg_3j` + user `tepE4jvR→6sNStgxa`** (both changed ✓ FE shipped, verified AFTER status=success — pre-deploy snapshot still showed old hash, correct timing). **Mig 42 `20260530034336_AddLeaveBalances` auto-applied prod** (tables 90→**91**, `LeaveBalances` EXISTS). Schema ✓: UserId/LeaveTypeId/Year/EntitledDays/UsedDays/AdjustmentDays decimal + AuditableEntity soft-delete. **UNIQUE `IX_LeaveBalances_UserId_LeaveTypeId_Year`** + **FK→LeaveTypes del=NO_ACTION** (=Restrict) ✓. New endpoint smoke: `GET /api/leave-balances/my` unauth=**401** (route live not 404) + admin auth=**200** lazy-default 5 LeaveTypes (ANNUAL12/COMPASSIONATE3/MATERNITY180/SICK30/UNPAID0, all Used=0, `remainingDays`=entitled ✓ DTO shape has remainingDays/entitledDays) + `?year=2026` admin route 401 unauth + `PUT /adjust`=411 (route reg). health live/ready 200 Healthy. **NO seed gate concern** (plain table, lazy DTO — Stage 4.6 N/A). 0 regression. Note: prev run #366 (ffb2062 docs STATUS update) was a CODE-path push w/ status=success — NOT docs-only-skipped (commit touched only .md but Gitea still ran since prior range?); actually #366 display_title is Docs but ran full → confirms agent-memory .md NOT in paths-ignore (`.claude/skills/**` ignored, `.claude/agent-memory/**` NOT). Tag `[s42, run367, pass, p11b-leavebalance, mig42]`.
|
||
- **2026-05-30 Run #365 sha=`75df04e` PASS ~4m05s (S42 P11-A fix workflow picker 2-bug + SetWorkflow endpoint, NO migration):** Code commit 11 files (4 BE controllers + 2 App features `LeaveOtApprovalFeatures`/`TravelVehicleApprovalFeatures` +125 lines + 2 FE `WorkflowAppDetailPage` ×2 + 1 test +79 lines). Status=success iter5 (started 10:15:45). **Bundle rotate admin `BLA09-qv→6D4k-aRi` + user `CXvejOE-→DkME-974`** (both changed ✓ FE fix shipped, verified AFTER status=success). +4 endpoint `PUT /api/{leave,ot,travel,vehicle-bookings}/{id}/workflow` (`Set{Module}WorkflowCommand`, route `[HttpPut("{id:guid}/workflow")]` body record `SetWorkflowBody(Guid ApprovalWorkflowId)`). Unauth smoke leave+ot/workflow → **401** (route exists, NOT 404 ✓). health live+ready 200 Healthy. Test gate **144** (CI both proj pre-deploy; grep undercounts InlineData=14 Fact at WorkflowAppApproveV2Tests). **NO migration** → skipped Stage 4.6 seed (verified #250). **NAMING RECONCILE:** Gitea task IDs are real #364 (e7b66cd, mem-labeled "#250") + #365 (this). Going forward use actual Gitea task id. **HEADS-UP em main:** follow-up commit `e47ef1d` (FE-User ProposalCreatePage workflow dropdown shape, latent S37 bug) pushed 10:19:17 DURING poll — NOT yet triggered CI run, will redeploy FE shortly (bundle may re-rotate). Out of scope this verdict. Tag `[s42, run365, pass, p11a-setworkflow]`.
|
||
- **2026-05-30 Run #364 (mem #250) sha=`e7b66cd` PASS ~4m07s (S42 P11-A wire ApproveV2+LevelOpinions 4 WorkflowApps):** 1 commit BE+FE×2+Mig41+Tests. Status=success iter3. Bundle rotate admin `cWAXid0q→BLA09-qv` + user `CX79e2kZ→CXvejOE-`. **Mig 41 auto-applied prod** (latest=`20260530021936_WireWorkflowAppsApprovalV2`). Tables 84→**90** (+5: Leave/Ot/Travel/VehicleRequest LevelOpinions + WorkflowAppCodeSequences — ALL EXIST). 4 new endpoint smoke 200 auth (leave/ot/travel/vehicle-requests) + unauth 401 (route exists) + POST .../approve=411 (route reg). health live/ready 200. **Stage 4.6 seed gate PASS** (gotcha #51): 4 WF seeded prod despite DemoSeed:Disabled — QT-NP/OT/CT/XE-V2-001 AppType=5/6/7/9, verified call-site L142-145 OUTSIDE `if(!demoSeedDisabled)` gate. Test gate 141 (CI runs both proj pre-deploy). Note: table count 90 vs spec-expected 89 = baseline-count diff, NOT missing table (all 5 present). Stale doc drift deploy.yml comments "54/17 test" (cosmetic, flag em main). Tag `[s42, run250, pass, p11a-approvev2-workflowapps]`.
|
||
- **2026-05-28 Run #247 sha=`e54a22d` PASS 3m25s (S38 SKELETON 5-plan combo Mig 39+40 dual):** Push 1 commit mega `Domain+App+Infra+Api+FE×2`. ALL PASS. Bundle rotate admin `CGueDk22→cWAXid0q` + user `CEt0QRgX→CX79e2kZ`. Mig 39+40 dual auto-applied startup (90830→90839). 6 endpoint smoke 200 (leave/ot/travel/vehicle/it-tickets/hr-dashboard `totalEmployees=33 male=17 female=16`). 6 new tables + 8 menu seeded. 0 regression. Fastest S38 deploy. Tag `[s38, run247, pass, skeleton-combo]`.
|
||
- **2026-05-28 Run #246 sha=`de1c378` PASS 3m53s (S37 Proposal Mig 37+38):** Bundle admin `C9kzTTmq→CGueDk22` + user `CC4DQ-Tr→CEt0QRgX`. Mig 38 AddProposals + 37 ExtendApplicableType. `/api/proposals` 200 empty + workflow `QT-DX-V2-001` ApplicableType=4 seed + 4 Off_DeXuat menu. Stage 4.6 sample seed INFRASTRUCTURE-gated correct (gotcha #51). Tag `[s37, run246, pass, proposal-v2]`.
|
||
- **Archived Run #359/#243/#242/#241/#240 + S35/S36 startup → `archive/2026-05-q4.md` + git d2f52ba (S40 curate):** Run #359 G-O2 Meeting Mig 36 · #243 HrmConfig BE 16 endpoint (BE-only bundle unchanged anti-pattern verify) · #242 FE inline forms 5 satellite · #241 Mig 35 HRM foundation · #240 satellite CRUD. Discovery #7 path-filter eval/** + #8 collection `proj_*`. KEY absorbed in essentials/Stage sections above.
|
||
- **Archived Run #232 (S29 gotcha #51 catch — SeedSampleContractWorkflowV2 nested in demoSeedDisabled → empty V2 dropdown, hoist fix) → `archive/2026-05-q4.md` + git. Smart Friend ROI 4× cumulative (S22 #44 + S25 #48 + S29 ApplicableType + S29 DemoSeed).**
|
||
|
||
---
|
||
|
||
## 🔄 Curate trigger
|
||
- >25KB → archive recent runs → `archive/<period>.md`. Dup failure patterns → merge. Stale >3mo → remove.
|
||
- **Last curate: 2026-05-29 S40 em main proxy** (35.3→~21KB): archived Run #359/#243/#242/#241/#240 + S35/S36 startup → q4 + git d2f52ba; refreshed stale 120→130 test + Mig 34→40 + Stage 3 111→130. Foundation (gotcha patterns + Stage 0-5 + Stage 4.6 + 10-point + Discovery #4-8) preserved. Prev: S34 q3 · S32 q2 · S22 runs.
|