Skip to content

System Architecture

rust-srec is an automated stream recorder built around a clear separation of concerns:

  • A control plane (REST API + configuration + orchestration)
  • A data plane (live status detection + downloads + danmu + post-processing)
  • A persistence layer (SQLite + filesystem outputs)

It is implemented as a set of long-running Tokio services managed by the runtime ServiceContainer.

System Architecture Overview

High-level topology

Runtime root: ServiceContainer

The ServiceContainer (in rust-srec/src/services/container.rs) wires everything together:

  • Initializes repositories and services (DB, config cache, managers)
  • Starts background tasks (scheduler actors, pipeline workers, outbox flushers)
  • Subscribes to event streams and forwards events between services
  • Owns the CancellationToken used for graceful shutdown

This gives the project one clear place to reason about lifecycle, dependencies, and shutdown order.

Core components (what each one actually does)

ConfigService (configuration + hot reload)

ConfigService is the configuration control plane. It loads and merges a 4-level hierarchy:

  1. Global defaults
  2. Platform configuration
  3. Template configuration
  4. Streamer-specific overrides

It also caches merged results and broadcasts ConfigUpdateEvent so runtime services can respond to changes without a restart.

See also: Configuration

StreamerManager (runtime state source of truth)

StreamerManager maintains the in-memory streamer metadata used by orchestration and downloads, with write-through persistence to SQLite.

Important correctness detail: on startup it performs restart recovery by resetting any streamers left in Live back to NotLive, so the normal NotLive → Live edge can trigger downloads again.

Scheduler (actor model orchestration)

The scheduler is a supervisor that manages self-scheduling actors:

  • StreamerActor: owns the timing and state loop for one streamer
  • PlatformActor: coordinates batch detection for batch-capable platforms
  • Supervisor: handles actor lifecycle, restart tracking, and shutdown reporting

Actors call into StreamMonitor for real status checks; the scheduler also reacts to configuration events to spawn/stop actors dynamically.

StreamMonitor (detect + filter + sessions + outbox)

StreamMonitor is the data-plane detector. It:

  • Resolves platform information and checks live status
  • Applies filters (time/keyword/category, etc.)
  • Creates/ends sessions and updates streamer state transactionally
  • Emits MonitorEvent via a DB-backed outbox for consistency

Outbox pattern: Monitor events are written in the same DB transaction as state/session updates, then a background task flushes the outbox to a Tokio broadcast channel. This reduces the chance of “state changed but event lost” during crashes or restarts.

DownloadManager (downloads + engine abstraction)

The download manager owns:

  • Concurrency limits (including extra slots for high priority downloads)
  • Retry and circuit breaker logic per engine/config key
  • Engine abstraction:
    • External processes: ffmpeg, streamlink
    • In-process Rust engine: mesio

It emits DownloadManagerEvent for lifecycle, segment boundaries, and (optionally) progress.

For persisted session segments, the backend keeps three separate timestamps:

  • created_at: when the segment started recording
  • completed_at: when the segment finished recording
  • persisted_at: when the segment metadata row was stored in SQLite

DanmuService (chat capture)

Danmu collection is session-scoped but writes files per segment:

  • A websocket connection stays alive for the session
  • Segment boundaries (from download events) open/close danmu files (e.g. XML)
  • Danmu events are forwarded to the pipeline for paired/session coordination

PipelineManager (job queue + DAG + worker pools)

The pipeline manager is the post-processing engine:

  • Maintains a DB-backed job queue (with recovery on restart)
  • Executes a DAG pipeline model (fan-in / fan-out)
  • Uses separate worker pools for CPU-bound and IO-bound processors
  • Coordinates multi-stage triggers:
    • Segment pipelines (single output file)
    • Paired-segment pipelines (video + danmu for the same segment index)
    • Session-complete pipelines (once all segments are complete)

See also: DAG Pipeline

NotificationService (event fan-out)

Notifications subscribe to monitor/download/pipeline events and deliver them to configured channels (Discord / Email / Webhook), with retry, circuit breakers, and dead-letter persistence.

See also: Notifications

Key flows

Recording lifecycle (end-to-end)

API request flow (control plane)

Event-driven communication

Most cross-service coordination happens via Tokio broadcast channels.

StreamPublisherTypical consumersNotes
ConfigUpdateEventConfigServiceScheduler, DownloadManager, PipelineManager, ServiceContainerDrives hot reload and cleanup when streamers become inactive
MonitorEventStreamMonitorServiceContainer, NotificationServiceEmitted through the DB outbox (best-effort delivery under restarts)
DownloadManagerEventDownloadManagerPipelineManager, Scheduler, NotificationServiceSegment boundaries are the main trigger for pipelines
DanmuEventDanmuServicePipelineManagerUsed for paired-segment and session-complete coordination
PipelineEventPipelineManagerNotificationServiceJob lifecycle events for observability

About throttling

PipelineManager contains an optional throttling subsystem (ThrottleController) that can emit events and apply download concurrency adjustments if a DownloadLimitAdjuster is wired in.

Output-root write gate

The download manager runs an output-root write gate (in downloader::output_root_gate) that operates at the filesystem boundary, complementing the engine-level circuit breakers that operate at the network/process boundary. It exists so that a single filesystem failure (disk full, stale bind mount, lost permissions) does not cascade into dozens of per-streamer retries that would flood the logs and DB outbox.

Healthy ──(record_failure: pre-start ENOENT / runtime ENOSPC / startup probe)──► Degraded

                              (mark_healthy: next real ensure_output_dir succeeds)  │
Healthy ◄───────────────────────────────────────────────────────────────────────────┘

Key properties:

  • Lock-free fast path. check() on a Healthy root is an atomic load plus a DashMap::get. No mutex on the hot path, no cost when there are no tracked failures.
  • Single-flight cooldown via CAS. When a root is Degraded, only one caller per cooldown window (30s default) is allowed through to attempt the real create_dir_all. Other concurrent callers fast-reject with the cached error. Mirrors the half-open pattern in CircuitBreaker.
  • No background probe task. The real ensure_output_dir call is the probe — the gate piggybacks on actual download attempts. A single one-shot probe runs at container startup to surface broken mounts from second zero.
  • Recovery hook. On Degraded → Healthy transition the gate clears consecutive_error_count, disabled_until, and last_error for every streamer whose backoff was caused by the gate (filtered by the "output-root blocked:" prefix). The whole affected fleet cascades out of backoff on the same tick.
  • One notification per transition. The Healthy → Degraded CAS is also what decides which caller emits the critical output_path_inaccessible notification, so users see exactly one alert per incident regardless of how many concurrent streamers are affected.

Exposed in /health as a single aggregated output-root component listing each Degraded root with its classified io::ErrorKind, rejected count, and staleness. See the notifications doc for the event shape and the Docker troubleshooting guide for the stale-mount failure mode.

Observability, health, and shutdown

  • Logging uses tracing with a reloadable filter and log retention cleanup
  • Health endpoints:
    • GET /api/health/live (no auth; suitable for container liveness)
    • GET /api/health and GET /api/health/ready (require auth when JWT is configured)
  • Shutdown:
    • The ServiceContainer holds a CancellationToken and propagates it to background tasks
    • SIGINT/SIGTERM triggers a graceful shutdown sequence

Released under the MIT License.