[CLAUDE] Phase5 prep: production infra + deploy scripts + 4 guides + FE refresh token
Backend production infra:
- Packages: Serilog.Sinks.File, HealthChecks.EntityFrameworkCore (RateLimiting is built into .NET 10)
- appsettings.Production.json (new): placeholder __SET_VIA_SECRETS__, AllowedOrigins, Serilog File sink with daily rolling and 30-day retention, RateLimit config
- appsettings.json + Development.json: add Serilog WriteTo Console
- Program.cs rewrite:
  - Serilog ReadFrom.Configuration (file sink in prod, console in dev)
  - Rate limiter: auth-login policy at 5 requests/min per IP (AuthController.Login) + GlobalLimiter at 300 requests/min per IP
  - Health checks: /health/live liveness (empty predicate) + /health/ready DB probe (AddDbContextCheck)
  - HSTS in production, 1 year
  - CORS origins read from config AllowedOrigins (dev default: 2 localhost origins)
- AuthController.Login decorated with [EnableRateLimiting("auth-login")]
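The auth-login limit of 5 requests per minute per IP can be pictured as a fixed-window counter. The TypeScript sketch below only illustrates that idea; it is not the ASP.NET Core limiter, and both the class name `FixedWindowLimiter` and the choice of a fixed window (rather than a sliding window or token bucket) are assumptions. Requests rejected here correspond to the HTTP 429 responses.

```typescript
// Illustrative fixed-window limiter. Names are hypothetical, not taken from the codebase.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true when the request is allowed, false when it should be rejected (HTTP 429).
  tryAcquire(key: string, now: number): boolean {
    const current = this.windows.get(key);
    if (current === undefined || now - current.start >= this.windowMs) {
      // First request for this key, or the previous window has elapsed: open a new window.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count < this.limit) {
      current.count += 1;
      return true;
    }
    return false;
  }
}

// Seven login attempts from one IP, one second apart, against a 5-per-minute policy.
const loginLimiter = new FixedWindowLimiter(5, 60000);
const results: boolean[] = [];
for (let i = 0; i < 7; i++) {
  results.push(loginLimiter.tryAcquire("203.0.113.7", i * 1000));
}
// results: five times true, then two times false
```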
Deploy scripts:
- scripts/deploy-iis.ps1: stop pool → back up current → clean + extract artifact → start pool → health check loop with 30 s timeout → print rollback instructions on failure
- scripts/backup-sql.ps1: BACKUP DATABASE with INIT + COMPRESSION + CHECKSUM, plus automatic cleanup with 30-day retention
- .gitea/workflows/deploy.yml (new): 4 jobs: build BE (Windows) + build 2 FE (Ubuntu, Node pinned to 20 via .nvmrc) + deploy-iis over a WinRM PSSession (secrets IIS_HOST/USER/PASSWORD/JWT_SECRET/DB_CONNECTION)
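The "health check loop with timeout" step in the deploy script boils down to polling a readiness probe until it passes or a deadline expires. The deploy script itself is PowerShell; the TypeScript sketch below only shows the control flow. The helper name `waitUntilHealthy`, the injected `probe` and `sleep` functions, and the polling interval are assumptions for illustration.

```typescript
// Hypothetical helper: poll a readiness probe until it passes or the time budget is used up.
type Probe = () => Promise<boolean>;

async function waitUntilHealthy(
  probe: Probe,
  timeoutMs: number,
  intervalMs: number,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<boolean> {
  // Elapsed time is approximated by the sum of sleep intervals (probe duration is ignored).
  let elapsed = 0;
  while (elapsed <= timeoutMs) {
    try {
      if (await probe()) {
        return true;
      }
    } catch {
      // Connection errors while the app pool is still starting count as "not ready yet".
    }
    await sleep(intervalMs);
    elapsed += intervalMs;
  }
  return false; // caller should print rollback instructions
}

// Example call against the readiness endpoint (needs a runtime with a global fetch):
// const ready = await waitUntilHealthy(
//   async () => (await fetch("https://api.solutionerp.local/health/ready")).ok,
//   30000,
//   2000,
// );
```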
New docs guides (4 files):
- deployment-iis.md: prereqs (IIS features, Hosting Bundle, SQL, WinRM) + first-time setup (app pool, 3 sites, HTTPS via win-acme, user-secrets) + day-to-day deploys (CI/CD + manual) + rollback + monitoring + troubleshooting + SPA web.config sample
- cicd.md: overview of the 4-job pipeline, secrets setup, Windows + Ubuntu runners, branch strategy, build optimizations, common CI/CD issues
- security-checklist.md: OWASP Top 10 2021 mapping with status + pre-go-live checklist + incident response
- runbook.md: daily ops (health/logs), restart/rollback, DB backup/restore/migration revert, user management (reset password, unlock, disable), monitoring (CPU/disk/connection pool), deployment checklist, common gotchas
Frontend refresh token (both apps, fe-admin + fe-user):
- lib/api.ts rewrite: add REFRESH_KEY and an axios response interceptor: 401 → POST /auth/refresh → retry the original request. Queue pattern so that when several requests fail in parallel only one refresh call runs. Retries are skipped for /auth/login and /auth/refresh to avoid an infinite loop. A _retry flag is set on the original config.
- contexts/AuthContext.tsx: store and remove REFRESH_KEY in login/logout
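The queue pattern described for lib/api.ts can be sketched without axios as a "single-flight" refresh: the first 401 starts the refresh, and every other 401 that arrives meanwhile awaits the same promise. This is a transport-agnostic illustration, not the actual interceptor; `createClient`, `Send`, `Refresh`, `ApiRequest` and `ApiResponse` are hypothetical names, and the real code attaches the equivalent logic to an axios response interceptor.

```typescript
// Transport-agnostic sketch of a single-flight token refresh. All names are hypothetical.
interface ApiRequest {
  url: string;
  _retry?: boolean; // set once a request has been retried, to prevent loops
}

interface ApiResponse {
  status: number;
  body?: unknown;
}

type Send = (req: ApiRequest, accessToken: string) => Promise<ApiResponse>;
type Refresh = () => Promise<string>; // resolves to a new access token

// Endpoints that must never trigger a refresh-and-retry cycle.
const SKIP_REFRESH = ["/auth/login", "/auth/refresh"];

function createClient(send: Send, refresh: Refresh, initialToken: string) {
  let accessToken = initialToken;
  let inFlight: Promise<string> | null = null;

  return async function request(req: ApiRequest): Promise<ApiResponse> {
    const response = await send(req, accessToken);
    const skip = SKIP_REFRESH.some((path) => req.url.endsWith(path));
    if (response.status !== 401 || req._retry === true || skip) {
      return response;
    }

    // The first 401 starts the refresh; concurrent 401s reuse the same promise.
    let pending = inFlight;
    if (pending === null) {
      pending = refresh();
      inFlight = pending;
      const started = pending;
      const clear = () => {
        if (inFlight === started) {
          inFlight = null;
        }
      };
      started.then(clear, clear);
    }

    accessToken = await pending;
    // Retry the original request once with the new token.
    return send({ ...req, _retry: true }, accessToken);
  };
}
```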
E2E verified:
- GET /health/live → 200 Healthy
- GET /health/ready → 200 Healthy (DB probe)
- Rate limit flood, 7 POST /auth/login → #1-5 HTTP 400 (wrong credentials) + #6-7 HTTP 429 Too Many Requests ✅
- TS check fe-admin + fe-user → pass
- dotnet build → 0 errors
Docs updates:
- docs/STATUS.md: Phase 5 prep done, next up Phase 5 production deploy + Phase 5.1 security hardening, cumulative stats at 8 commits
- docs/HANDOFF.md: add a Phase 5 prep row to the phase table, update the file tree with guides + scripts + workflows, git state at commit 8
- docs/changelog/migration-todos.md: tick Phase 5 prep items (12 items done) + remaining Phase 5 deploy items + Phase 5.1 security hardening list
- docs/changelog/sessions/2026-04-21-1530-phase5-prep.md: detailed session log
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
195
docs/guides/runbook.md
Normal file
@@ -0,0 +1,195 @@
# Runbook — Operations

> Common operational tasks. Copy and paste the commands as needed.

## 1. Daily operations

### 1.1 Check health

```powershell
# -SkipCertificateCheck requires PowerShell 6 or later (not available in Windows PowerShell 5.1)
Invoke-WebRequest https://api.solutionerp.local/health/ready -SkipCertificateCheck
# → Status 200 "Healthy"
```

### 1.2 Check logs

```powershell
# Tail today's log
Get-Content "C:\inetpub\solution-erp\api\logs\solution-erp-$(Get-Date -Format 'yyyyMMdd').log" -Tail 50 -Wait

# Search for errors (ERR) and fatals (FTL), with 2 lines of context
Select-String -Path "C:\inetpub\solution-erp\api\logs\*.log" -Pattern "ERR|FTL" -Context 2
```

### 1.3 Check recent failed logins

```sql
-- Applies once an audit log exists (Phase 5.1). For now only ContractApprovals is recorded, so check the Serilog file instead.
```

## 2. Restart / rollback

### 2.1 Restart Api app pool

```powershell
Restart-WebAppPool -Name SolutionErpApi
```

### 2.2 Restart all of IIS (heavy, only when needed)

```powershell
iisreset /noforce
```

### 2.3 Rollback deploy

```powershell
# The deploy script backs up automatically to C:\inetpub\solution-erp\backups\api-{timestamp}
Stop-WebAppPool SolutionErpApi
# Pick the newest API backup folder (names sort by timestamp)
$latest = Get-ChildItem "C:\inetpub\solution-erp\backups" -Directory -Filter "api-*" | Sort-Object Name -Descending | Select-Object -First 1
Copy-Item "$($latest.FullName)\*" -Destination "C:\inetpub\solution-erp\api\" -Recurse -Force
Start-WebAppPool SolutionErpApi
Invoke-WebRequest https://api.solutionerp.local/health/ready -SkipCertificateCheck  # verify
```

## 3. Database

### 3.1 Manual backup (outside the daily job)

```powershell
.\scripts\backup-sql.ps1 -Server "." -Database "SolutionErp" -BackupDir "D:\Backups\SolutionErp-manual"
```

### 3.2 Restore from backup

```sql
-- WARNING: Destructive. Stop the app first.
USE master;
ALTER DATABASE SolutionErp SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
RESTORE DATABASE SolutionErp
    FROM DISK = N'D:\Backups\SolutionErp\SolutionErp_20260421-020000.bak'
    WITH REPLACE, RECOVERY;
ALTER DATABASE SolutionErp SET MULTI_USER;
```

### 3.3 Rollback migration

```powershell
# List applied migrations
cd C:\Deploy\staging  # a location where the .NET SDK is available
dotnet ef migrations list --project src\Backend\SolutionErp.Infrastructure --startup-project src\Backend\SolutionErp.Api

# Roll back to a specific migration
dotnet ef database update <PreviousMigrationName> --project ... --startup-project ...
```

### 3.4 Clear test data (dev)

```sql
-- Clear all Contracts and related rows (master data is kept)
DELETE FROM ContractApprovals;
DELETE FROM ContractComments;
DELETE FROM ContractAttachments;
DELETE FROM Contracts;
DELETE FROM ContractCodeSequences;
```

## 4. User management

### 4.1 Create a new user

```sql
-- Phase 5.1 adds a UI for this; for now it is manual via SQL (not recommended: the password hash must be in the correct format)
-- Recommended: create the user through UserManager in a small .NET script, or via the API `POST /api/users` (not implemented yet)
```

### 4.2 Reset password admin (emergency)

```powershell
# Run a one-off script on the server
# NOTE: --reset-password is not implemented yet (planned for Phase 5.1)
cd C:\inetpub\solution-erp\api
dotnet SolutionErp.Api.dll --reset-password admin@solutionerp.local NewPassword@2026
```

Temporary workaround: update `PasswordHash` through the Identity `UserManager` in code, then redeploy.

### 4.3 Unlock a locked account

```sql
UPDATE Users SET LockoutEnd = NULL, AccessFailedCount = 0 WHERE Email = 'user@example.com';
```

### 4.4 Disable user

```sql
UPDATE Users SET IsActive = 0 WHERE Email = 'user@example.com';
-- Note: a JWT that was already issued stays valid until it expires (1h); Phase 5.1 needs to check IsActive in middleware
```

## 5. Monitoring + incident

### 5.1 High CPU app pool

```powershell
# Identify the worker process
Get-Process w3wp | Select-Object Id, CPU, WorkingSet64, StartTime
# Kill it if stuck (IIS restarts the worker automatically)
Stop-Process -Id <pid> -Force
```

### 5.2 Out of disk

```powershell
# Inspect the logs folder, oldest files first
Get-ChildItem "C:\inetpub\solution-erp\api\logs" | Sort-Object LastWriteTime | Select-Object -First 20
# Delete logs older than 30 days (retention is already configured, but verify)
Get-ChildItem "C:\inetpub\solution-erp\api\logs" -Filter "*.log" |
    Where-Object LastWriteTime -lt (Get-Date).AddDays(-30) | Remove-Item
```

### 5.3 Suspected brute-force attack

```powershell
# Count 401 responses per client IP in the IIS log
# Field index 8 is c-ip with the default W3C field order; check the #Fields header if logging was customized
Get-Content C:\inetpub\logs\LogFiles\W3SVC1\*.log -Tail 5000 |
    Select-String " 401 " | Group-Object { ($_.Line -split ' ')[8] } |
    Sort-Object Count -Descending | Select-Object -First 10
# If an IP looks suspicious, block it with IIS IP Restrictions or a firewall rule
```

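The pipeline above relies on a fixed column position for the client IP. For offline analysis of exported log lines, the same grouping can be made independent of column order by reading the `#Fields:` directive that IIS writes at the top of W3C logs. The sketch below is a minimal TypeScript illustration; the function name `countUnauthorizedByIp` is hypothetical and the sample log lines in the usage note are invented.

```typescript
// Hypothetical helper: count HTTP 401 responses per client IP from IIS W3C log lines.
function countUnauthorizedByIp(lines: string[]): Map<string, number> {
  let ipIndex = -1;
  let statusIndex = -1;
  const counts = new Map<string, number>();

  for (const line of lines) {
    if (line.startsWith("#Fields:")) {
      // Locate the c-ip and sc-status columns from the header instead of hard-coding them.
      const fields = line.slice("#Fields:".length).trim().split(/\s+/);
      ipIndex = fields.indexOf("c-ip");
      statusIndex = fields.indexOf("sc-status");
      continue;
    }
    if (line.startsWith("#") || ipIndex < 0 || statusIndex < 0) {
      continue; // other directives, or data seen before any #Fields header
    }
    const columns = line.split(" ");
    if (columns[statusIndex] === "401") {
      const ip = columns[ipIndex];
      if (ip !== undefined) {
        counts.set(ip, (counts.get(ip) ?? 0) + 1);
      }
    }
  }
  return counts;
}
```

Sorting the resulting entries by count in descending order gives the same "top offenders" view as the `Group-Object` pipeline.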
### 5.4 DB connection pool exhausted

```sql
-- Check active connections
SELECT DB_NAME(dbid) AS DB, COUNT(*) AS Connections, loginame AS Login
FROM sys.sysprocesses
WHERE dbid > 0
GROUP BY dbid, loginame
ORDER BY 2 DESC;

-- Kill a specific connection if it is stuck
KILL <spid>;
```

## 6. Deployment checklist

Before deploying:
- [ ] Back up the DB (manually if the automatic job has not run yet)
- [ ] Note the commit SHA that is currently live
- [ ] Check that CI/CD passed all checks
- [ ] Notify the team in Slack/Teams (if there will be downtime)

After deploying:
- [ ] Health check `/health/ready` → 200
- [ ] Smoke test: login + list contracts + export Excel
- [ ] Check that the log shows no ERR entries in the first 5 minutes
- [ ] Monitor CPU/RAM for 15 minutes

## 7. Common operational "gotchas"

| Symptom | Fix |
|---|---|
| App pool crashes with rapid-fail protection after deploy | Disable it temporarily: `Set-ItemProperty IIS:\AppPools\SolutionErpApi -Name failure.rapidFailProtection -Value false`, debug via the logs, then re-enable it |
| Users are logged out en masse after deploy | Check whether Jwt:Secret changed; rotating the secret forces every user to log in again (expected if intentional) |
| Migration fails with a "connection string" error | Check whether user secrets / env vars are missing from the app pool advanced settings |
| FE shows a blank page | Check paths in the F12 console; usually `base` in vite.config.ts differs between environments, or the web.config SPA rewrite is missing |
| Export Excel returns 500 | Check that `wwwroot/templates` contains all 5 .docx/.xlsx files; ClosedXML fails when a template is missing |

## 8. Escalation contacts

| Role | Name | Contact |
|---|---|---|
| Dev lead | pqhuy@solutions.local | pqhuy1987@gmail.com |
| DBA | TBD | - |
| On-call 24/7 | TBD | - |

## 9. Related

- [`deployment-iis.md`](deployment-iis.md): detailed setup
- [`cicd.md`](cicd.md): CI/CD pipeline
- [`security-checklist.md`](security-checklist.md): incident response
- [`../gotchas.md`](../gotchas.md): dev + ops pitfalls
- [`../database/database-guide.md`](../database/database-guide.md): backup/restore details