0007. BullMQ Repeatable Jobs vs NestJS Schedule

Date: 2025-04-25
Status: Proposed
Deciders: [Max (Tech Lead), Backend Lead, DevOps Engineer, Product Manager]
Tags: [nova], [bullmq], [nestjs], [architecture]

Context

We need a reliable, cluster-safe job scheduler for recurring tasks (daily digests, user-defined reminders, heavy reports).
Two candidate approaches:
1. BullMQ repeatable jobs (called Job Schedulers in ≥ v5.16): jobs are defined once, persisted in Redis, and executed once per interval across the whole fleet. (BullMQ Docs: Repeatable Jobs, Job Schedulers)
2. @nestjs/schedule decorators: cron/interval/timeout triggers that run in-process on every Node instance. (NestJS Documentation)
Key constraints:
- Service runs as multiple replicas behind a load-balancer.
- Tasks must survive container restarts & deployments.
- We already operate a Redis cluster for caching.
- Observability (metrics/UI) and back-pressure are nice-to-have.

Decision

We will adopt BullMQ repeatable jobs for all recurring tasks.
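
As a rough illustration of what this decision means in code, the sketch below registers recurring tasks through `upsertJobScheduler`, the Job Scheduler method BullMQ documents for ≥ v5.16. The task IDs, cron patterns, job names, and the `SchedulerQueue` interface are assumptions made for this example; the interface is a structural stand-in so the sketch can be exercised without Redis, and a real `Queue` from `bullmq` would be passed in its place.

```typescript
// Structural subset of the BullMQ Queue API that this sketch relies on.
// Assumption: a real `Queue` from bullmq (>= v5.16) satisfies this shape.
interface RepeatSpec {
  pattern?: string; // cron expression
  every?: number;   // fixed interval in milliseconds
  tz?: string;      // time zone applied to cron patterns
}

interface JobTemplate {
  name: string;
  data?: Record<string, unknown>;
}

interface SchedulerQueue {
  upsertJobScheduler(id: string, repeat: RepeatSpec, template: JobTemplate): Promise<unknown>;
}

interface RecurringTask {
  id: string;
  repeat: RepeatSpec;
  template: JobTemplate;
}

// Hypothetical task list; IDs, patterns, and job names are illustrative only.
const RECURRING_TASKS: RecurringTask[] = [
  { id: 'daily-digest', repeat: { pattern: '0 7 * * *', tz: 'UTC' }, template: { name: 'send-digest' } },
  { id: 'reminder-sweep', repeat: { every: 60000 }, template: { name: 'dispatch-reminders' } },
];

// Upserting by ID is what makes this safe to call from every replica at startup:
// a second call with the same ID updates the existing scheduler instead of adding one.
function registerRecurringTasks(
  queue: SchedulerQueue,
  tasks: RecurringTask[] = RECURRING_TASKS,
): Promise<unknown[]> {
  return Promise.all(tasks.map((t) => queue.upsertJobScheduler(t.id, t.repeat, t.template)));
}
```

In the service itself, this function would presumably be called once during module initialisation with the shared queue instance; the exact wiring is left open here.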

Why this option wins

Factor	BullMQ	NestJS Schedule
Single execution in multi-pod deployments	✔ (queue-based locking)	✖ (every pod fires)
Persistence across restarts	✔ stored in Redis	✖ in-memory only
Retries / back-off / concurrency caps	Built-in	DIY
Monitoring UI	Bull Board / Taskforce	Logs only
Extra infra	Redis already exists	None

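The decisive factors were a single scheduled execution per interval across the cluster and survivability after redeploys; both are critical for user-visible digests and billing jobs. BullMQ's delivery guarantee is at-least-once rather than strictly exactly-once (a stalled job can be picked up again), so job handlers should still be idempotent.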
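
The "Retries / back-off / concurrency caps" row refers to options BullMQ accepts on jobs and workers (`attempts`, `backoff`, `removeOnComplete`, `removeOnFail`, `concurrency`). The sketch below shows one plausible configuration for heavy reports; the concrete numbers and constant names are assumptions, and `retryDelays` is a hypothetical helper that only approximates how an exponential back-off grows, for reasoning about worst-case latency.

```typescript
// Shape of the job options used in this sketch (a subset of BullMQ's job options).
interface RetryPolicy {
  attempts: number; // total tries, including the first one
  backoff: { type: 'exponential' | 'fixed'; delay: number }; // delay in milliseconds
  removeOnComplete: { count: number }; // keep only the most recent N completed jobs
  removeOnFail: { count: number };     // keep only the most recent N failed jobs
}

// Illustrative values only; tune per task.
const REPORT_JOB_OPTS: RetryPolicy = {
  attempts: 5,
  backoff: { type: 'exponential', delay: 2000 },
  removeOnComplete: { count: 100 },
  removeOnFail: { count: 500 },
};

// Worker-side cap: at most two report jobs processed in parallel per worker.
const REPORT_WORKER_OPTS = { concurrency: 2 };

// Approximate wait before each retry: delay * 2^(attemptsMade - 1) for
// exponential back-off, or a constant delay for fixed back-off.
function retryDelays(policy: RetryPolicy): number[] {
  const delays: number[] = [];
  for (let attemptsMade = 1; attemptsMade < policy.attempts; attemptsMade++) {
    const factor = policy.backoff.type === 'exponential' ? Math.pow(2, attemptsMade - 1) : 1;
    delays.push(Math.round(factor * policy.backoff.delay));
  }
  return delays;
}
```

With the values above, a report that keeps failing would be retried four times over roughly 30 seconds before being marked failed.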

Consequences

Positive

Reliability & idempotence in a horizontally-scaled environment.
Operational insight via Bull Board dashboards and events.
Unified queueing: same stack can host one-off heavy jobs, not just cron.
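
Idempotence still has to be implemented in the job handlers themselves. One possible approach, sketched below under stated assumptions, is to derive a key from the task ID and the scheduled time slot and skip work that has already completed for that key. `slotKey` and `OncePerSlot` are hypothetical names, and the in-memory `Set` stands in for a shared store (for example a Redis key written only if absent) that a multi-replica deployment would need.

```typescript
// Derive a stable key for "this task, this interval", regardless of small
// differences in when the job actually fired.
function slotKey(taskId: string, scheduledAtMs: number, intervalMs: number): string {
  const slot = Math.floor(scheduledAtMs / intervalMs) * intervalMs;
  return `${taskId}:${slot}`;
}

// In-memory stand-in for a shared "already done" store.
class OncePerSlot {
  private readonly done = new Set<string>();

  // Runs `work` unless it already completed for this key; returns whether it ran.
  // The key is recorded only after `work` succeeds, so a failed run can be retried.
  run(key: string, work: () => void): boolean {
    if (this.done.has(key)) {
      return false;
    }
    work();
    this.done.add(key);
    return true;
  }
}
```

A digest handler could wrap its send step in `run(slotKey(...), ...)` so that a re-delivered job for the same day becomes a no-op.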

Negative

Redis is a hard dependency for the service layer; local dev needs Docker or a stub.
Slightly higher cognitive load: queue concepts, workers, back-off, etc.
Need to migrate the existing @Cron() decorators (≈ 6) to BullMQ job schedulers.

Neutral

Implementation: introduce a jobs module, register a Queue and a Worker (a separate QueueScheduler has not been needed since BullMQ 2.0), add a worker pod.
Deployment: add readiness/liveness probes for the worker; tune Redis memory and TTL settings.
Docs: write run-book for re-playing or removing repeatable configs.
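
For the run-book item, removing stale repeatable configs could look roughly like the sketch below. It relies on `getJobSchedulers` and `removeJobScheduler`, which BullMQ documents for ≥ v5.16; the `SchedulerAdmin` interface and `pruneSchedulers` are hypothetical names for this example, and the assumption that each returned entry exposes the scheduler ID as `key` should be checked against the BullMQ version actually deployed.

```typescript
// Structural subset of the BullMQ Queue API used for clean-up.
// Assumption: each scheduler entry carries its ID in the `key` field.
interface SchedulerAdmin {
  getJobSchedulers(): Promise<Array<{ key: string }>>;
  removeJobScheduler(id: string): Promise<boolean>;
}

// Removes every scheduler whose ID is not in `keep` and returns the removed IDs,
// so the operator can log exactly what changed.
async function pruneSchedulers(queue: SchedulerAdmin, keep: ReadonlySet<string>): Promise<string[]> {
  const existing = await queue.getJobSchedulers();
  const stale = existing.map((s) => s.key).filter((id) => !keep.has(id));
  for (const id of stale) {
    await queue.removeJobScheduler(id);
  }
  return stale;
}
```

Running this after a deploy with the set of IDs the code still defines would keep Redis from accumulating schedulers for tasks that no longer exist.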

References

BullMQ Docs – Repeatable Jobs
BullMQ Docs – Job Schedulers (≥ v5.16)
NestJS Docs – Task Scheduling Module
