fix(scheduler): resolve SetSnapshot race conditions and remove usage throttle

Backend: Fix three race conditions in SetSnapshot that caused account
scheduling anomalies and broken sticky sessions:
- Use Lua CAS script for atomic version activation, preventing version
  rollback when concurrent goroutines write snapshots simultaneously
- Add UnlockBucket to release the rebuild lock immediately after completion
  instead of waiting for the 30s TTL to expire
- Replace immediate DEL of old snapshots with 60s EXPIRE grace period,
  preventing readers from hitting empty ZRANGE during version switches
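
A minimal sketch of the compare-and-set idea behind the first fix, not the
script from this commit. The script text, the key layout (KEYS[1] as the
active-version key, ARGV[1] as the candidate version), and the names
activateVersionLua, versionStore, ActivateIfNewer, and Active are all
hypothetical. It also assumes versions are integers that only increase. The
in-memory store stands in for Redis so the rule can be run without a server:

```go
package main

import (
	"fmt"
	"sync"
)

// activateVersionLua is a hypothetical script, not the one from this commit.
// It advances the active-version key (KEYS[1]) to ARGV[1] only when ARGV[1]
// is numerically greater, so a slower writer cannot roll the version back.
// Redis runs a script atomically, so the read and the write cannot interleave
// with another writer.
const activateVersionLua = `
local current = tonumber(redis.call('GET', KEYS[1]) or '0')
local candidate = tonumber(ARGV[1])
if candidate > current then
  redis.call('SET', KEYS[1], ARGV[1])
  return 1
end
return 0
`

// versionStore is an in-memory stand-in for the Redis key. It exists only to
// demonstrate the same compare-and-set rule in runnable form.
type versionStore struct {
	mu     sync.Mutex
	active int64
}

// ActivateIfNewer performs the check and the write under one lock, mirroring
// the atomicity the script gets from Redis. It reports whether the candidate
// became the active version.
func (s *versionStore) ActivateIfNewer(candidate int64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if candidate <= s.active {
		return false
	}
	s.active = candidate
	return true
}

// Active returns the currently active version.
func (s *versionStore) Active() int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.active
}

func main() {
	store := &versionStore{}
	var wg sync.WaitGroup
	// Fifty writers race to activate versions 1..50 in arbitrary order.
	for v := int64(1); v <= 50; v++ {
		wg.Add(1)
		go func(candidate int64) {
			defer wg.Done()
			store.ActivateIfNewer(candidate)
		}(v)
	}
	wg.Wait()
	// The highest version wins regardless of goroutine scheduling.
	fmt.Println("active version:", store.Active())
}
```

With a plain read-then-write, a goroutine holding an older version could
overwrite a newer one between the two steps. Folding the comparison and the
write into one atomic unit removes that window.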

Frontend: Remove serial queue throttle (1-2s delay per request) from
usage loading since backend now uses passive sampling. All usage
requests execute immediately in parallel.
shaw
2026-04-29 22:48:39 +08:00
parent 40feb86ba4
commit 8bf2a7b88a
7 changed files with 91 additions and 95 deletions

@@ -544,6 +544,9 @@ func (s *SchedulerSnapshotService) rebuildBucket(ctx context.Context, bucket Sch
 	if !ok {
 		return nil
 	}
+	defer func() {
+		_ = s.cache.UnlockBucket(ctx, bucket)
+	}()
 	rebuildCtx, cancel := context.WithTimeout(ctx, 30*time.Second)
 	defer cancel()