Radio Track Play & Sync Architecture Review
Date: Feb 14, 2026. Scope: worker/src/radio.ts (~1045 lines), .vitepress/components/RadioPage.vue (~2537 lines), and related migrations and routes.
How It Works Today
The system simulates "live radio" using a clock-sync approach:
- Server (`computeNowPlaying`): fetches all tracks for a station from D1, applies a deterministic seeded shuffle, then uses `(wallClock - fixedEpoch) % totalPlaylistDuration` to walk the playlist and find the current track + seek offset.
- Client (`RadioPage.vue`): polls `/radio/now-playing/:stationId`, seeks the `<audio>` element to the server-provided offset, and manages crossfade/preload/recovery locally.
- Presence: heartbeat POSTs update a KV key per station; stale entries are pruned on read.
- Chat: polling at 5s intervals against the existing chat system.
- Reactions: stored in D1, counts fetched per track on each now-playing request.
- Trim detection: peaks.json files in R2 are read on each request to compute silence-trimmed boundaries.
- Backfill jobs: cron tasks generate AI track names (OpenAI) and lyrics alignment (ElevenLabs) for tracks missing them.
Fundamental Issue: Playlist Instability
The core sync model has a structural flaw. The "now playing" position is derived from:
```ts
const totalDuration = playlist.reduce((sum, t) => sum + t.duration, 0)
const elapsed = ((now - CYCLE_START) % totalDuration + totalDuration) % totalDuration
```

Any change to the playlist invalidates every listener's position. When a new track is generated and classified into a station (or a track's trim values change), `totalDuration` shifts, and the modulo calculation jumps to a completely different position in the playlist. Every listener on that station would hear a sudden discontinuity on their next poll.
There is no playlist versioning. Two requests a fraction of a second apart can return different playlists if a new track just completed generation. The server recomputes from scratch on every request.
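A toy calculation makes the jump concrete. The epoch and durations below are made up, not the real playlist:

```typescript
// Illustrative only: CYCLE_START and the durations are invented numbers.
const CYCLE_START = 1_700_000_000_000 // fixed epoch, ms

function positionFor(totalDurationMs: number, nowMs: number): number {
  // Same modulo walk the server performs.
  return ((nowMs - CYCLE_START) % totalDurationMs + totalDurationMs) % totalDurationMs
}

const now = CYCLE_START + 10_000_000        // 10,000s after the epoch
const before = positionFor(3_600_000, now)  // 60-minute playlist → 2,800,000 ms in
const after = positionFor(3_780_000, now)   // same playlist + one 3-minute track → 2,440,000 ms in
// Adding 3 minutes of audio moved every listener's position by 6 minutes.
```

The displacement is unrelated to the size of the change: it depends on how many full cycles have elapsed since the epoch.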
Severity: High. This is the single biggest reliability concern. It won't surface with a static playlist, but as generation activity increases, discontinuities will become frequent.
Issue: R2 Reads on Every Request
Even though D1 has `trim_start_seconds` and `trim_end_seconds` columns, `handleRadioNowPlaying` fetches `peaks.json` from R2 on every call. A single now-playing request performs up to 3 R2 reads (current track, next track, and possibly the after-next track in the boundary-advance case).
The DB already reads the trim columns in `getStationTracks`:

```ts
const dbTrimStart = typeof r.trim_start_seconds === 'number' ? r.trim_start_seconds : 0
const dbTrimEnd = typeof r.trim_end_seconds === 'number' ? r.trim_end_seconds : rawDur
```

But then immediately overrides them with fresh R2 reads:
```ts
const [currentTrim, nextTrim, reactionCounts] = await Promise.all([
  fetchTrimForTrack(env.GENAI_ASSETS, nowPlaying.track),
  nowPlaying.nextTrack
    ? fetchTrimForTrack(env.GENAI_ASSETS, nowPlaying.nextTrack)
    : Promise.resolve(null),
  getTrackReactionCounts(env.DB, nowPlaying.track.id),
])
nowPlaying.track.trimStart = currentTrim.trimStart
nowPlaying.track.trimEnd = currentTrim.trimEnd
```

The R2 reads make sense as a backfill for tracks that don't have trim values in D1 yet, but they shouldn't be the primary path.
Severity: Medium. Adds unnecessary latency and R2 cost per request.
Issue: Boundary-Advance Is a Symptom, Not a Fix
The code at lines 918-943 detects when the computed seek offset is within 3 seconds of the track's end and manually advances to the next track. This exists because the clock-sync model can place you at a track boundary, which would cause the client to load a track that immediately ends.
```ts
const BOUNDARY_BUFFER = 3
if (nowPlaying.seekOffset >= nowPlaying.track.trimEnd - BOUNDARY_BUFFER && nowPlaying.nextTrack) {
  // ... manually advance to next track, re-shuffle, fetch another R2 trim
}
```

This is a patch for the underlying problem: the server doesn't manage track transitions as events; it recomputes position from scratch each time. The boundary-advance also triggers an additional R2 read and a second seeded shuffle of the full track list.
Severity: Medium. Creates a second code path for transitions that differs from the primary playlist walk.
Issue: KV Listener Presence Doesn't Scale
`getAllListenerCounts` performs 8 sequential KV reads (one per station) on every `/radio/stations` request:

```ts
async function getAllListenerCounts(kv: KVNamespace): Promise<Record<string, number>> {
  const counts: Record<string, number> = {}
  for (const station of STATIONS) {
    const listeners = await getListeners(kv, station.id)
    counts[station.id] = listeners.length
  }
  return counts
}
```

KV is eventually consistent, so listener counts can be stale or inconsistent across requests. Each listener entry includes full user data (name, picture) serialized as JSON, and the entire array is read and rewritten on every heartbeat.
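Short of moving off KV entirely, the sequential loop can at least run concurrently. A minimal sketch, where the `KV` type, `STATIONS` list, `getListeners` body, and key format are stand-ins for the real helpers in radio.ts:

```typescript
// Stand-ins for the real types and helpers in radio.ts; the key format is assumed.
type KV = { get(key: string): Promise<string | null> }
const STATIONS = [{ id: 'electronic' }, { id: 'cinematic' }, { id: 'mixed' }]

async function getListeners(kv: KV, stationId: string): Promise<unknown[]> {
  const raw = await kv.get(`radio:listeners:${stationId}`) // hypothetical key format
  return raw ? (JSON.parse(raw) as unknown[]) : []
}

// All stations fetched concurrently: roughly one KV round-trip of latency, not eight.
async function getAllListenerCounts(kv: KV): Promise<Record<string, number>> {
  const entries = await Promise.all(
    STATIONS.map(async (s) => [s.id, (await getListeners(kv, s.id)).length] as const),
  )
  return Object.fromEntries(entries)
}
```

This doesn't fix the consistency or rewrite-on-heartbeat problems, but it removes the latency multiplier for free.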
Severity: Medium. Works at low scale but will degrade with more stations or listeners.
Issue: Client-Side Complexity
RadioPage.vue is a single 1660-line `<script setup>` managing ten concerns simultaneously:
- Audio playback (element lifecycle, seek, volume)
- Crossfade transitions (preload, fade animation via requestAnimationFrame)
- Recovery (8 different trigger paths)
- Lyrics sync (alignment parsing, word-level highlighting, auto-scroll)
- Chat (polling, sending, scroll-to-bottom)
- Heartbeat (30s interval)
- Reactions (optimistic UI, floating emoji animation)
- Sign-in prompt (timer-based, volume reduction)
- WaveSurfer visualization
- Offline/online detection
The recovery system alone has paths triggered by: `audio-error-{code}`, `audio-error-generic`, `network-restored`, `visibility-restored`, `play-failed-on-visible`, `watchdog-stall`, `unexpected-pause`, and `no-track`. These all funnel into one `recoverPlayback` function, which is good, but the proliferation of entry points suggests the root causes aren't well understood.
Severity: Medium. Makes the codebase hard to test, debug, or modify safely.
Issue: No Reaction Deduplication or Rate Limiting
The `/radio/react` endpoint creates a new row for every call with no throttle or uniqueness check:

```ts
async function addReaction(db, stationId, trackId, userId, reaction) {
  const id = crypto.randomUUID()
  await db.prepare(
    `INSERT INTO radio_reactions (id, station_id, track_id, user_id, reaction) VALUES (?, ?, ?, ?, ?)`
  ).bind(id, stationId, trackId, userId, reaction).run()
}
```

The same user can submit unlimited reactions for the same track. The `idx_rr_user` index exists on `(user_id, track_id)` but isn't used for dedup.
Severity: Low-medium. Could be exploited to inflate counts or fill the DB.
Issue: Genre Classifier Is Fragile
`classifyGenre` does naive substring matching with order-dependent first-match-wins:

```ts
export function classifyGenre(prompt: string): string {
  const lower = prompt.toLowerCase()
  for (const station of STATIONS) {
    if (station.id === 'mixed') continue
    for (const kw of station.keywords) {
      if (lower.includes(kw)) return station.id
    }
  }
  return 'mixed'
}
```

A prompt like "dark electronic beats" would match "cinematic" (because "dark" appears in cinematic's keywords before electronic's), not "electronic". There's a trailing space in `'war '` (likely to avoid matching "warm"), but it also misses "war-themed" and a sentence-final "war".
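Matching on word boundaries instead of substrings fixes that class of bug. A sketch, with a hypothetical station list (the real keywords live in `STATIONS`; keyword ordering/weighting is a separate problem this doesn't solve):

```typescript
// Hypothetical station list; the real keywords live in STATIONS in radio.ts.
interface Station { id: string; keywords: string[] }

function classifyGenreByWord(prompt: string, stations: Station[]): string {
  // Tokenize on non-alphanumerics so "warm" no longer matches "war",
  // while "war-themed" and a sentence-final "war" now do.
  const words = new Set(prompt.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean))
  for (const station of stations) {
    if (station.id === 'mixed') continue
    if (station.keywords.some((kw) => words.has(kw.trim()))) return station.id
  }
  return 'mixed'
}
```

The trailing-space hack in the keyword list becomes unnecessary, since `kw.trim()` normalizes it away.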
Severity: Low. Misclassifications put tracks in wrong stations but don't break functionality.
Issue: Chat Polling
Chat uses 5-second HTTP polling. This means up to 5s latency on messages, and every connected client makes a request every 5 seconds regardless of activity. For a radio page that users keep open for long periods, this adds up.
Severity: Low. Functional but wasteful.
What I'd Do Differently
1. Pre-computed Playlist Schedules
Run a cron job (every 5-10 min) that generates a versioned playlist schedule per station. Store it in KV as a JSON document containing: track order, cumulative start times, and a version hash.
The now-playing endpoint becomes a simple lookup: read the schedule from KV, binary-search for the current position. When the playlist changes, the old schedule stays active until the cron produces the new one, so all listeners transition together at a known boundary.
This eliminates the fundamental instability: playlist changes no longer cause position jumps for active listeners.
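The lookup side of such a schedule could look like the sketch below. The `Schedule` shape (version hash, cycle anchor, cumulative start times) is an assumption about the KV document, not existing code:

```typescript
// Assumed schedule document shape; not current code.
interface ScheduleEntry { trackId: string; startsAt: number } // cumulative seconds
interface Schedule {
  version: string
  cycleStart: number       // epoch seconds the cycle is anchored to
  totalDuration: number    // seconds
  entries: ScheduleEntry[] // sorted by startsAt, first entry has startsAt === 0
}

function findNowPlaying(schedule: Schedule, nowSec: number) {
  const { cycleStart, totalDuration, entries } = schedule
  const elapsed = (((nowSec - cycleStart) % totalDuration) + totalDuration) % totalDuration
  // Binary search for the rightmost entry with startsAt <= elapsed.
  let lo = 0
  let hi = entries.length - 1
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2)
    if (entries[mid].startsAt <= elapsed) lo = mid
    else hi = mid - 1
  }
  return {
    trackId: entries[lo].trackId,
    seekOffset: elapsed - entries[lo].startsAt,
    version: schedule.version,
  }
}
```

Because the version travels with every response, the client can detect a schedule swap and resynchronize deliberately instead of being silently teleported.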
2. Use D1 Trim Values as Source of Truth
Backfill any tracks that have null trim columns from R2 peaks in a cron job (the pattern already exists for track names and lyrics alignment). Then remove the per-request R2 reads entirely. This cuts request latency significantly and eliminates the redundant double-computation of trim values.
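The D1-first rule can be expressed as a pure helper. The column names come from the existing schema; the row shape and the `needsBackfill` flag are assumptions for illustration:

```typescript
// Row shape mirrors the D1 columns named above; needsBackfill is hypothetical.
interface TrackRow {
  trim_start_seconds: number | null
  trim_end_seconds: number | null
  duration: number
}

function resolveTrim(row: TrackRow): { trimStart: number; trimEnd: number; needsBackfill: boolean } {
  const hasDbTrim =
    typeof row.trim_start_seconds === 'number' && typeof row.trim_end_seconds === 'number'
  if (hasDbTrim) {
    // D1 is the source of truth: no R2 read on the request path.
    return {
      trimStart: row.trim_start_seconds as number,
      trimEnd: row.trim_end_seconds as number,
      needsBackfill: false,
    }
  }
  // Untrimmed default; a cron job later backfills the columns from peaks.json.
  return { trimStart: 0, trimEnd: row.duration, needsBackfill: true }
}
```

Tracks flagged `needsBackfill` would be queued for the cron, never fetched inline.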
3. Durable Objects for Real-Time State
One Durable Object per station, holding: current listener set, recent reactions, and chat messages. Clients connect via WebSocket instead of polling. Benefits:
- Real-time chat (no 5s delay)
- Accurate listener counts (no KV eventual consistency)
- Broadcast track changes to all listeners simultaneously
- Server can push "next track" info instead of clients discovering it independently
- Rate limiting on reactions becomes trivial (per-connection state)
4. Decompose the Client
Split RadioPage.vue into focused composables:
- `useRadioPlayer` - audio element, seek, volume, play/pause
- `useRadioCrossfade` - preload logic, fade animation
- `useRadioRecovery` - reconnection, watchdog, offline detection
- `useRadioLyrics` - alignment parsing, word highlighting, scroll sync
- `useRadioChat` - messages, sending, scroll management
- `useRadioPresence` - heartbeat, listener list
The page component becomes orchestration only. Each composable is testable in isolation.
5. Server-Side Track Quality Gating
Before a track enters radio rotation, validate:
- Minimum playable duration > 10s
- Peak amplitude above a threshold across the track
- Trim window is reasonable (not trimming 80%+ of the track)
- Flag low-quality tracks for manual review rather than auto-adding
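Taken together, the gate could be a single predicate. All thresholds below are illustrative, and `peakAmplitude` is assumed to be normalized to 0..1:

```typescript
// Thresholds are illustrative, not tuned; peakAmplitude assumed normalized to 0..1.
interface TrackStats { duration: number; trimStart: number; trimEnd: number; peakAmplitude: number }

function isRadioReady(t: TrackStats): boolean {
  const playable = t.trimEnd - t.trimStart
  if (playable < 10) return false               // below minimum playable duration
  if (t.peakAmplitude < 0.05) return false      // effectively silent
  if (playable < t.duration * 0.2) return false // trim window removes 80%+ of the track
  return true
}
```

Tracks failing any check would be flagged for manual review rather than dropped silently.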
6. Rate-Limit Reactions
Add a UNIQUE(user_id, track_id, reaction) constraint or at minimum a time-based throttle (one reaction per user per track per type). The optimistic UI update on the client is fine, but the server should reject duplicates.
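The invariant the unique constraint enforces can be modeled in memory; this is a sketch of the rule only, since the real enforcement would be a D1 unique index paired with `INSERT OR IGNORE`:

```typescript
// Models the UNIQUE(user_id, track_id, reaction) rule; real enforcement would be
// a D1 unique index plus INSERT OR IGNORE, not this in-memory set.
const seen = new Set<string>()

function tryAddReaction(userId: string, trackId: string, reaction: string): boolean {
  const key = `${userId}\u0000${trackId}\u0000${reaction}`
  if (seen.has(key)) return false // duplicate: reject, count unchanged
  seen.add(key)
  return true
}
```

On conflict the endpoint can still return 200 so the client's optimistic UI stands; the count simply doesn't move.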
7. Embedding-Based Genre Classifier
Since the project already uses OpenAI for track names, a lightweight classification call (or even a small embedding similarity check against station descriptions) would be more accurate and maintainable than the keyword list. This could run in the same cron batch as track name generation.
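With station-description embeddings precomputed offline, classification reduces to a cosine-similarity argmax. The vectors below are toys; real ones would come from the embeddings API:

```typescript
// Toy low-dimensional vectors; real station embeddings would be precomputed offline.
interface StationVec { id: string; vec: number[] }

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    na += a[i] * a[i]
    nb += b[i] * b[i]
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1)
}

// minScore is an assumed threshold: anything less similar falls back to 'mixed'.
function classifyByEmbedding(promptVec: number[], stations: StationVec[], minScore = 0.3): string {
  let best = 'mixed'
  let bestScore = minScore
  for (const s of stations) {
    const score = cosine(promptVec, s.vec)
    if (score > bestScore) { best = s.id; bestScore = score }
  }
  return best
}
```

Unlike first-match-wins keywords, this ranks all stations and keeps 'mixed' as the honest fallback for ambiguous prompts.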
Appendix: Durable Objects + WebSocket Hibernation Plan
Standard CF Workers are stateless and short-lived, so they can't hold persistent WebSocket connections. The solution is Durable Objects with the WebSocket Hibernation API. Each station gets one DO instance. The Worker routes WebSocket upgrades to the correct DO.
wrangler.toml additions
```toml
[durable_objects]
bindings = [
  { name = "RADIO_STATION", class_name = "RadioStationDO" }
]

[[migrations]]
tag = "v1"
new_classes = ["RadioStationDO"]
```

Worker router (added to existing fetch handler)
```ts
if (path.match(/^\/radio\/ws\/[^/]+$/) && request.headers.get("Upgrade") === "websocket") {
  const stationId = path.split("/").pop()!
  const id = env.RADIO_STATION.idFromName(stationId)
  const stub = env.RADIO_STATION.get(id)
  return stub.fetch(request)
}
```

`idFromName(stationId)` means each station ID (like "electronic") deterministically maps to one DO instance. All listeners for that station connect to the same object.
Durable Object class (RadioStationDO)
The key is the WebSocket Hibernation API. Without it, a DO holding 200 idle connections gets billed for wall-clock time while nothing happens. With hibernation (`state.acceptWebSocket` instead of manual socket handling), Cloudflare suspends the DO between events. You're billed only for:
- `webSocketMessage` invocations (when someone chats or reacts)
- `alarm` invocations (when tracks rotate)
- not the idle time between those events
For 200 listeners where tracks change every 3 minutes and someone chats every few seconds, you'd get a handful of DO invocations per second instead of continuous billing.
```ts
export class RadioStationDO implements DurableObject {
  private state: DurableObjectState
  private env: Env

  constructor(state: DurableObjectState, env: Env) {
    this.state = state
    this.env = env
  }

  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url)

    if (request.headers.get("Upgrade") === "websocket") {
      const pair = new WebSocketPair()
      const [client, server] = Object.values(pair)
      const userId = await this.authenticateRequest(request)

      // Hibernation API: acceptWebSocket with tags for identification
      this.state.acceptWebSocket(server, [userId || "anon"])

      // Push current state immediately on connect
      const nowPlaying = await this.getCurrentTrack()
      server.send(JSON.stringify({ type: "now-playing", data: nowPlaying }))
      server.send(JSON.stringify({
        type: "listeners",
        data: { count: this.state.getWebSockets().length }
      }))

      return new Response(null, { status: 101, webSocket: client })
    }

    // Internal calls (e.g., cron triggers track rotation)
    if (url.pathname.endsWith("/rotate")) {
      await this.rotateTrack()
      return new Response("ok")
    }

    return new Response("Expected WebSocket", { status: 400 })
  }

  // Hibernation API handler: fires when any connected socket sends a message
  async webSocketMessage(ws: WebSocket, message: string) {
    const msg = JSON.parse(message)
    switch (msg.type) {
      case "chat":
        this.broadcast({ type: "chat", data: msg.data })
        break
      case "reaction":
        this.broadcast({ type: "reaction", data: msg.data })
        break
      case "heartbeat":
        // No-op. Socket liveness is handled by hibernation ping/pong.
        break
    }
  }

  // Hibernation API handler: fires when a socket disconnects
  async webSocketClose(ws: WebSocket) {
    this.broadcast({
      type: "listeners",
      data: { count: this.state.getWebSockets().length }
    })
  }

  // Hibernation API handler: fires at the scheduled time
  async alarm() {
    await this.rotateTrack()
    const track = await this.getCurrentTrack()
    if (track) {
      const remainingMs = (track.duration - track.elapsed) * 1000
      await this.state.storage.setAlarm(Date.now() + remainingMs)
    }
  }

  private broadcast(msg: object) {
    const payload = JSON.stringify(msg)
    for (const ws of this.state.getWebSockets()) {
      try { ws.send(payload) } catch { ws.close() }
    }
  }

  private async rotateTrack() {
    const next = await this.getNextTrack()
    this.broadcast({ type: "track-change", data: next })
  }
}
```

What this replaces
| Current approach | With Durable Objects |
|---|---|
| Client polls `/now-playing` to discover track changes | DO pushes `track-change` to all sockets simultaneously |
| KV-based listener presence (eventual consistency, 8 reads per `/stations`) | `state.getWebSockets().length` is exact and instant |
| Chat polling every 5s via HTTP | Chat messages broadcast immediately via WebSocket |
| Heartbeat POST every 30s | WebSocket liveness is automatic (ping/pong) |
| Client recovery for missed track transitions | Server pushes transitions, client just obeys |
Track rotation via alarms
Instead of every client independently computing "what track is playing now" from wall-clock math, the DO owns track timing. When a track starts, the DO sets an alarm for when it ends:
```ts
this.state.storage.setAlarm(Date.now() + trackDurationMs)
```

When the alarm fires, `rotateTrack()` advances to the next track and broadcasts the change to all listeners at once. All clients transition together, with no clock drift or modulo instability.
Migration path
This doesn't need to be all-or-nothing:
- Add the DO class and wrangler config
- Keep existing HTTP endpoints working as-is
- Client tries WebSocket first, falls back to HTTP polling if connection fails
- Migrate features incrementally: presence first (simplest), then chat, then track sync
- Once WebSocket path is stable, remove the polling fallback
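The dual-transport step is easiest when both paths feed one message dispatcher. A sketch, where the envelope shape is an assumption chosen to match the DO's broadcast messages in the appendix:

```typescript
// Envelope assumed to match the DO broadcast shape ({ type, data }).
type ServerMsg =
  | { type: 'now-playing'; data: unknown }
  | { type: 'listeners'; data: { count: number } }
  | { type: 'chat'; data: unknown }
  | { type: 'reaction'; data: unknown }
  | { type: 'track-change'; data: unknown }

type Handlers = {
  [K in ServerMsg['type']]?: (data: Extract<ServerMsg, { type: K }>['data']) => void
}

// Both the WebSocket onmessage callback and the HTTP-polling fallback hand raw
// JSON to this one function, so the rest of the client never knows which
// transport delivered the update.
function dispatch(raw: string, handlers: Handlers): boolean {
  let msg: ServerMsg
  try { msg = JSON.parse(raw) as ServerMsg } catch { return false }
  const handler = handlers[msg.type] as ((d: unknown) => void) | undefined
  if (!handler) return false
  handler(msg.data)
  return true
}
```

Removing the polling fallback in the final step then touches only the transport layer, not the feature code behind the handlers.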
Cost model
Durable Objects pricing (as of early 2026):
- Requests: $0.15 per million
- Duration: $12.50 per million GB-s (only when active, not hibernating)
- Storage: $0.20 per GB-month
For a radio feature with ~50-200 concurrent listeners per station and 8 stations, the cost would be minimal. The hibernation API means you only pay for actual message processing, not for holding idle connections. This would likely be cheaper than the current approach of every client making HTTP requests every 5-30 seconds.
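A back-of-envelope check using the request rate quoted above; the traffic figures are assumptions, not measurements:

```typescript
// Assumed traffic: ~2 chat/reaction messages per second across all stations,
// plus 8 stations each rotating tracks ~20 times per hour.
const REQUEST_RATE_PER_MILLION = 0.15 // USD, from the pricing above

const messageInvocations = 2 * 86_400 * 30 // ≈ 5.18M webSocketMessage calls/month
const alarmInvocations = 8 * 20 * 24 * 30  // ≈ 115K alarm calls/month
const monthlyInvocations = messageInvocations + alarmInvocations

const requestCost = (monthlyInvocations / 1_000_000) * REQUEST_RATE_PER_MILLION
// ≈ $0.79/month in request charges before duration and storage.
```

Even with generous duration charges on top, that's far below the noise floor for a hobby-scale deployment, which supports the "minimal cost" claim.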
Priority Order
| Priority | Change | Effort | Impact |
|---|---|---|---|
| 1 | Pre-computed playlist schedules | Medium | Fixes the fundamental sync instability |
| 2 | Use D1 trims, drop per-request R2 reads | Low | Reduces latency and cost immediately |
| 3 | Reaction dedup/rate limiting | Low | Prevents abuse |
| 4 | Decompose client into composables | Medium | Maintainability, testability |
| 5 | Durable Objects + WebSockets | High | Real-time features, accurate presence |
| 6 | Track quality gating | Low | Prevents bad tracks from disrupting playback |
| 7 | Better genre classification | Low | Improves station relevance |
Summary
The current architecture works for low-to-moderate traffic with a slowly changing playlist, but it has a fundamental fragility: the sync model breaks on playlist changes, and the server does redundant work on every request. The client compensates for server-side gaps with complex recovery logic.
The path to long-term stability is: pre-computed schedules (fixes sync), persistent connections via Durable Objects (fixes presence/chat/reactions), cached trim data in D1 (fixes latency), and a decomposed client (fixes maintainability). The recovery logic can then be simplified dramatically because most of the failure modes it handles today stem from the server and client disagreeing about state.