Radio Track Play & Sync Architecture Review
Date: Feb 14, 2026. Scope: worker/src/radio.ts (~1045 lines), .vitepress/components/RadioPage.vue (~2537 lines), and related migrations and routes.
How It Works Today
The system simulates "live radio" using a clock-sync approach:
- Server (`computeNowPlaying`): fetches all tracks for a station from D1, applies a deterministic seeded shuffle, then uses `(wallClock - fixedEpoch) % totalPlaylistDuration` to walk the playlist and find the current track + seek offset.
- Client (`RadioPage.vue`): polls `/radio/now-playing/:stationId`, seeks the `<audio>` element to the server-provided offset, and manages crossfade/preload/recovery locally.
- Presence: heartbeat POSTs update a KV key per station; stale entries are pruned on read.
- Chat: polling at 5s intervals against the existing chat system.
- Reactions: stored in D1, counts fetched per track on each now-playing request.
- Trim detection: peaks.json files in R2 are read on each request to compute silence-trimmed boundaries.
- Backfill jobs: cron tasks generate AI track names (OpenAI) and lyrics alignment (ElevenLabs) for tracks missing them.
Fundamental Issue: Playlist Instability
The core sync model has a structural flaw. The "now playing" position is derived from:
```ts
const totalDuration = playlist.reduce((sum, t) => sum + t.duration, 0)
const elapsed = ((now - CYCLE_START) % totalDuration + totalDuration) % totalDuration
```

Any change to the playlist invalidates every listener's position. When a new track is generated and classified into a station (or a track's trim values change), `totalDuration` shifts, and the modulo calculation jumps to a completely different position in the playlist. Every listener on that station would hear a sudden discontinuity on their next poll.
There is no playlist versioning. Two requests a fraction of a second apart can return different playlists if a new track just completed generation. The server recomputes from scratch on every request.
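A toy calculation makes the jump concrete. The epoch and durations below are made up, not the real playlist:

```typescript
// Illustrative only: CYCLE_START and the durations are invented numbers.
const CYCLE_START = 1_700_000_000_000 // fixed epoch, ms

function positionFor(totalDurationMs: number, nowMs: number): number {
  // Same modulo walk the server performs.
  return ((nowMs - CYCLE_START) % totalDurationMs + totalDurationMs) % totalDurationMs
}

const now = CYCLE_START + 10_000_000        // 10,000s after the epoch
const before = positionFor(3_600_000, now)  // 60-minute playlist → 2,800,000 ms in
const after = positionFor(3_780_000, now)   // same playlist + one 3-minute track → 2,440,000 ms in
// Adding 3 minutes of audio moved every listener's position by 6 minutes.
```

The displacement is unrelated to the size of the change: it depends on how many full cycles have elapsed since the epoch.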
Severity: High. This is the single biggest reliability concern. It won't surface with a static playlist, but as generation activity increases, discontinuities will become frequent.
Issue: R2 Reads on Every Request
Even though D1 has `trim_start_seconds` and `trim_end_seconds` columns, `handleRadioNowPlaying` fetches `peaks.json` from R2 on every call. A single now-playing request performs up to 3 R2 reads (current track, next track, and possibly the after-next track in the boundary-advance case).
The DB already reads the trim columns in `getStationTracks`:

```ts
const dbTrimStart = typeof r.trim_start_seconds === 'number' ? r.trim_start_seconds : 0
const dbTrimEnd = typeof r.trim_end_seconds === 'number' ? r.trim_end_seconds : rawDur
```

But then immediately overrides them with fresh R2 reads:
```ts
const [currentTrim, nextTrim, reactionCounts] = await Promise.all([
  fetchTrimForTrack(env.GENAI_ASSETS, nowPlaying.track),
  nowPlaying.nextTrack
    ? fetchTrimForTrack(env.GENAI_ASSETS, nowPlaying.nextTrack)
    : Promise.resolve(null),
  getTrackReactionCounts(env.DB, nowPlaying.track.id),
])
nowPlaying.track.trimStart = currentTrim.trimStart
nowPlaying.track.trimEnd = currentTrim.trimEnd
```

The R2 reads make sense as a backfill for tracks that don't have trim values in D1 yet, but they shouldn't be the primary path.
Severity: Medium. Adds unnecessary latency and R2 cost per request.
Issue: Boundary-Advance Is a Symptom, Not a Fix
The code at lines 918-943 detects when the computed seek offset is within 3 seconds of the track's end and manually advances to the next track. This exists because the clock-sync model can place you at a track boundary, which would cause the client to load a track that immediately ends.
```ts
const BOUNDARY_BUFFER = 3
if (nowPlaying.seekOffset >= nowPlaying.track.trimEnd - BOUNDARY_BUFFER && nowPlaying.nextTrack) {
  // ... manually advance to next track, re-shuffle, fetch another R2 trim
}
```

This is a patch for the underlying problem: the server doesn't manage track transitions as events; it recomputes position from scratch each time. The boundary-advance also triggers an additional R2 read and a second seeded shuffle of the full track list.
Severity: Medium. Creates a second code path for transitions that differs from the primary playlist walk.
Issue: KV Listener Presence Doesn't Scale
`getAllListenerCounts` performs 8 sequential KV reads (one per station) on every `/radio/stations` request:

```ts
async function getAllListenerCounts(kv: KVNamespace): Promise<Record<string, number>> {
  const counts: Record<string, number> = {}
  for (const station of STATIONS) {
    const listeners = await getListeners(kv, station.id)
    counts[station.id] = listeners.length
  }
  return counts
}
```

KV is eventually consistent, so listener counts can be stale or inconsistent across requests. Each listener entry includes full user data (name, picture) serialized as JSON, and the entire array is read and rewritten on every heartbeat.
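Short of moving off KV entirely, the sequential loop can at least run concurrently. A minimal sketch, where the `KV` type, `STATIONS` list, `getListeners` body, and key format are stand-ins for the real helpers in radio.ts:

```typescript
// Stand-ins for the real types and helpers in radio.ts; the key format is assumed.
type KV = { get(key: string): Promise<string | null> }
const STATIONS = [{ id: 'electronic' }, { id: 'cinematic' }, { id: 'mixed' }]

async function getListeners(kv: KV, stationId: string): Promise<unknown[]> {
  const raw = await kv.get(`radio:listeners:${stationId}`) // hypothetical key format
  return raw ? (JSON.parse(raw) as unknown[]) : []
}

// All stations fetched concurrently: roughly one KV round-trip of latency, not eight.
async function getAllListenerCounts(kv: KV): Promise<Record<string, number>> {
  const entries = await Promise.all(
    STATIONS.map(async (s) => [s.id, (await getListeners(kv, s.id)).length] as const),
  )
  return Object.fromEntries(entries)
}
```

This doesn't fix the consistency or rewrite-on-heartbeat problems, but it removes the latency multiplier for free.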
Severity: Medium. Works at low scale but will degrade with more stations or listeners.
Issue: Client-Side Complexity
RadioPage.vue is a single 1660-line `<script setup>` managing ten concerns simultaneously:
- Audio playback (element lifecycle, seek, volume)
- Crossfade transitions (preload, fade animation via requestAnimationFrame)
- Recovery (8 different trigger paths)
- Lyrics sync (alignment parsing, word-level highlighting, auto-scroll)
- Chat (polling, sending, scroll-to-bottom)
- Heartbeat (30s interval)
- Reactions (optimistic UI, floating emoji animation)
- Sign-in prompt (timer-based, volume reduction)
- WaveSurfer visualization
- Offline/online detection
The recovery system alone has paths triggered by: `audio-error-{code}`, `audio-error-generic`, `network-restored`, `visibility-restored`, `play-failed-on-visible`, `watchdog-stall`, `unexpected-pause`, and `no-track`. These all funnel into one `recoverPlayback` function, which is good, but the proliferation of entry points suggests the root causes aren't well understood.
Severity: Medium. Makes the codebase hard to test, debug, or modify safely.
Issue: No Reaction Deduplication or Rate Limiting
The `/radio/react` endpoint creates a new row for every call with no throttle or uniqueness check:

```ts
async function addReaction(db, stationId, trackId, userId, reaction) {
  const id = crypto.randomUUID()
  await db.prepare(
    `INSERT INTO radio_reactions (id, station_id, track_id, user_id, reaction) VALUES (?, ?, ?, ?, ?)`
  ).bind(id, stationId, trackId, userId, reaction).run()
}
```

The same user can submit unlimited reactions for the same track. The `idx_rr_user` index exists on `(user_id, track_id)` but isn't used for dedup.
Severity: Low-medium. Could be exploited to inflate counts or fill the DB.
Issue: Genre Classifier Is Fragile
`classifyGenre` does naive substring matching with order-dependent first-match-wins:

```ts
export function classifyGenre(prompt: string): string {
  const lower = prompt.toLowerCase()
  for (const station of STATIONS) {
    if (station.id === 'mixed') continue
    for (const kw of station.keywords) {
      if (lower.includes(kw)) return station.id
    }
  }
  return 'mixed'
}
```

A prompt like "dark electronic beats" would match "cinematic" (because "dark" appears in cinematic's keywords before electronic's), not "electronic". There's a trailing space in `'war '` (likely to avoid matching "warm"), but it also misses "war-themed" and a sentence-final "war".
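Matching on word boundaries instead of substrings fixes that class of bug. A sketch, with a hypothetical station list (the real keywords live in `STATIONS`; keyword ordering/weighting is a separate problem this doesn't solve):

```typescript
// Hypothetical station list; the real keywords live in STATIONS in radio.ts.
interface Station { id: string; keywords: string[] }

function classifyGenreByWord(prompt: string, stations: Station[]): string {
  // Tokenize on non-alphanumerics so "warm" no longer matches "war",
  // while "war-themed" and a sentence-final "war" now do.
  const words = new Set(prompt.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean))
  for (const station of stations) {
    if (station.id === 'mixed') continue
    if (station.keywords.some((kw) => words.has(kw.trim()))) return station.id
  }
  return 'mixed'
}
```

The trailing-space hack in the keyword list becomes unnecessary, since `kw.trim()` normalizes it away.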
Severity: Low. Misclassifications put tracks in wrong stations but don't break functionality.
Issue: Chat Polling
Chat uses 5-second HTTP polling. This means up to 5s latency on messages, and every connected client makes a request every 5 seconds regardless of activity. For a radio page that users keep open for long periods, this adds up.
Severity: Low. Functional but wasteful.
What I'd Do Differently
1. Pre-computed Playlist Schedules
Run a cron job (every 5-10 min) that generates a versioned playlist schedule per station. Store it in KV as a JSON document containing: track order, cumulative start times, and a version hash.
The now-playing endpoint becomes a simple lookup: read the schedule from KV, binary-search for the current position. When the playlist changes, the old schedule stays active until the cron produces the new one, so all listeners transition together at a known boundary.
This eliminates the fundamental instability: playlist changes no longer cause position jumps for active listeners.
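The lookup side of such a schedule could look like the sketch below. The `Schedule` shape (version hash, cycle anchor, cumulative start times) is an assumption about the KV document, not existing code:

```typescript
// Assumed schedule document shape; not current code.
interface ScheduleEntry { trackId: string; startsAt: number } // cumulative seconds
interface Schedule {
  version: string
  cycleStart: number       // epoch seconds the cycle is anchored to
  totalDuration: number    // seconds
  entries: ScheduleEntry[] // sorted by startsAt, first entry has startsAt === 0
}

function findNowPlaying(schedule: Schedule, nowSec: number) {
  const { cycleStart, totalDuration, entries } = schedule
  const elapsed = (((nowSec - cycleStart) % totalDuration) + totalDuration) % totalDuration
  // Binary search for the rightmost entry with startsAt <= elapsed.
  let lo = 0
  let hi = entries.length - 1
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2)
    if (entries[mid].startsAt <= elapsed) lo = mid
    else hi = mid - 1
  }
  return {
    trackId: entries[lo].trackId,
    seekOffset: elapsed - entries[lo].startsAt,
    version: schedule.version,
  }
}
```

Because the version travels with every response, the client can detect a schedule swap and resynchronize deliberately instead of being silently teleported.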
2. Use D1 Trim Values as Source of Truth
Backfill any tracks that have null trim columns from R2 peaks in a cron job (the pattern already exists for track names and lyrics alignment). Then remove the per-request R2 reads entirely. This cuts request latency significantly and eliminates the redundant double-computation of trim values.
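The D1-first rule can be expressed as a pure helper. The column names come from the existing schema; the row shape and the `needsBackfill` flag are assumptions for illustration:

```typescript
// Row shape mirrors the D1 columns named above; needsBackfill is hypothetical.
interface TrackRow {
  trim_start_seconds: number | null
  trim_end_seconds: number | null
  duration: number
}

function resolveTrim(row: TrackRow): { trimStart: number; trimEnd: number; needsBackfill: boolean } {
  const hasDbTrim =
    typeof row.trim_start_seconds === 'number' && typeof row.trim_end_seconds === 'number'
  if (hasDbTrim) {
    // D1 is the source of truth: no R2 read on the request path.
    return {
      trimStart: row.trim_start_seconds as number,
      trimEnd: row.trim_end_seconds as number,
      needsBackfill: false,
    }
  }
  // Untrimmed default; a cron job later backfills the columns from peaks.json.
  return { trimStart: 0, trimEnd: row.duration, needsBackfill: true }
}
```

Tracks flagged `needsBackfill` would be queued for the cron, never fetched inline.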
3. Durable Objects for Real-Time State
One Durable Object per station, holding: current listener set, recent reactions, and chat messages. Clients connect via WebSocket instead of polling. Benefits:
- Real-time chat (no 5s delay)
- Accurate listener counts (no KV eventual consistency)
- Broadcast track changes to all listeners simultaneously
- Server can push "next track" info instead of clients discovering it independently
- Rate limiting on reactions becomes trivial (per-connection state)
4. Decompose the Client
Split RadioPage.vue into focused composables:
- `useRadioPlayer` - audio element, seek, volume, play/pause
- `useRadioCrossfade` - preload logic, fade animation
- `useRadioRecovery` - reconnection, watchdog, offline detection
- `useRadioLyrics` - alignment parsing, word highlighting, scroll sync
- `useRadioChat` - messages, sending, scroll management
- `useRadioPresence` - heartbeat, listener list
The page component becomes orchestration only. Each composable is testable in isolation.
5. Server-Side Track Quality Gating
Before a track enters radio rotation, validate:
- Minimum playable duration > 10s
- Peak amplitude above a threshold across the track
- Trim window is reasonable (not trimming 80%+ of the track)
- Flag low-quality tracks for manual review rather than auto-adding
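Taken together, the gate could be a single predicate. All thresholds below are illustrative, and `peakAmplitude` is assumed to be normalized to 0..1:

```typescript
// Thresholds are illustrative, not tuned; peakAmplitude assumed normalized to 0..1.
interface TrackStats { duration: number; trimStart: number; trimEnd: number; peakAmplitude: number }

function isRadioReady(t: TrackStats): boolean {
  const playable = t.trimEnd - t.trimStart
  if (playable < 10) return false               // below minimum playable duration
  if (t.peakAmplitude < 0.05) return false      // effectively silent
  if (playable < t.duration * 0.2) return false // trim window removes 80%+ of the track
  return true
}
```

Tracks failing any check would be flagged for manual review rather than dropped silently.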
6. Rate-Limit Reactions
Add a UNIQUE(user_id, track_id, reaction) constraint or at minimum a time-based throttle (one reaction per user per track per type). The optimistic UI update on the client is fine, but the server should reject duplicates.
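The invariant the unique constraint enforces can be modeled in memory; this is a sketch of the rule only, since the real enforcement would be a D1 unique index paired with `INSERT OR IGNORE`:

```typescript
// Models the UNIQUE(user_id, track_id, reaction) rule; real enforcement would be
// a D1 unique index plus INSERT OR IGNORE, not this in-memory set.
const seen = new Set<string>()

function tryAddReaction(userId: string, trackId: string, reaction: string): boolean {
  const key = `${userId}\u0000${trackId}\u0000${reaction}`
  if (seen.has(key)) return false // duplicate: reject, count unchanged
  seen.add(key)
  return true
}
```

On conflict the endpoint can still return 200 so the client's optimistic UI stands; the count simply doesn't move.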
7. Embedding-Based Genre Classifier
Since the project already uses OpenAI for track names, a lightweight classification call (or even a small embedding similarity check against station descriptions) would be more accurate and maintainable than the keyword list. This could run in the same cron batch as track name generation.
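With station-description embeddings precomputed offline, classification reduces to a cosine-similarity argmax. The vectors below are toys; real ones would come from the embeddings API:

```typescript
// Toy low-dimensional vectors; real station embeddings would be precomputed offline.
interface StationVec { id: string; vec: number[] }

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    na += a[i] * a[i]
    nb += b[i] * b[i]
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1)
}

// minScore is an assumed threshold: anything less similar falls back to 'mixed'.
function classifyByEmbedding(promptVec: number[], stations: StationVec[], minScore = 0.3): string {
  let best = 'mixed'
  let bestScore = minScore
  for (const s of stations) {
    const score = cosine(promptVec, s.vec)
    if (score > bestScore) { best = s.id; bestScore = score }
  }
  return best
}
```

Unlike first-match-wins keywords, this ranks all stations and keeps 'mixed' as the honest fallback for ambiguous prompts.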
Appendix: Durable Objects + WebSocket Hibernation Plan
Standard CF Workers are stateless and short-lived, so they can't hold persistent WebSocket connections. The solution is Durable Objects with the WebSocket Hibernation API. Each station gets one DO instance. The Worker routes WebSocket upgrades to the correct DO.
wrangler.toml additions
```toml
[durable_objects]
bindings = [
  { name = "RADIO_STATION", class_name = "RadioStationDO" }
]

[[migrations]]
tag = "v1"
new_classes = ["RadioStationDO"]
```

Worker router (added to existing fetch handler)
```ts
if (path.match(/^\/radio\/ws\/[^/]+$/) && request.headers.get("Upgrade") === "websocket") {
  const stationId = path.split("/").pop()!
  const id = env.RADIO_STATION.idFromName(stationId)
  const stub = env.RADIO_STATION.get(id)
  return stub.fetch(request)
}
```

`idFromName(stationId)` means each station ID (like "electronic") deterministically maps to one DO instance. All listeners for that station connect to the same object.
Durable Object class (RadioStationDO)
The key is the WebSocket Hibernation API. Without it, a DO holding 200 idle connections gets billed for wall-clock time while nothing happens. With hibernation (`state.acceptWebSocket` instead of manual socket handling), Cloudflare suspends the DO between events. You're billed only for:
- `webSocketMessage` invocations (when someone chats or reacts)
- `alarm` invocations (when tracks rotate)
- not the idle time between those events
For 200 listeners where tracks change every 3 minutes and someone chats every few seconds, you'd get a handful of DO invocations per second instead of continuous billing.
```ts
export class RadioStationDO implements DurableObject {
  private state: DurableObjectState
  private env: Env

  constructor(state: DurableObjectState, env: Env) {
    this.state = state
    this.env = env
  }

  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url)

    if (request.headers.get("Upgrade") === "websocket") {
      const pair = new WebSocketPair()
      const [client, server] = Object.values(pair)
      const userId = await this.authenticateRequest(request)

      // Hibernation API: acceptWebSocket with tags for identification
      this.state.acceptWebSocket(server, [userId || "anon"])

      // Push current state immediately on connect
      const nowPlaying = await this.getCurrentTrack()
      server.send(JSON.stringify({ type: "now-playing", data: nowPlaying }))
      server.send(JSON.stringify({
        type: "listeners",
        data: { count: this.state.getWebSockets().length }
      }))

      return new Response(null, { status: 101, webSocket: client })
    }

    // Internal calls (e.g., cron triggers track rotation)
    if (url.pathname.endsWith("/rotate")) {
      await this.rotateTrack()
      return new Response("ok")
    }

    return new Response("Expected WebSocket", { status: 400 })
  }

  // Hibernation API handler: fires when any connected socket sends a message
  async webSocketMessage(ws: WebSocket, message: string) {
    const msg = JSON.parse(message)
    switch (msg.type) {
      case "chat":
        this.broadcast({ type: "chat", data: msg.data })
        break
      case "reaction":
        this.broadcast({ type: "reaction", data: msg.data })
        break
      case "heartbeat":
        // No-op. Socket liveness is handled by hibernation ping/pong.
        break
    }
  }

  // Hibernation API handler: fires when a socket disconnects
  async webSocketClose(ws: WebSocket) {
    this.broadcast({
      type: "listeners",
      data: { count: this.state.getWebSockets().length }
    })
  }

  // Hibernation API handler: fires at the scheduled time
  async alarm() {
    await this.rotateTrack()
    const track = await this.getCurrentTrack()
    if (track) {
      const remainingMs = (track.duration - track.elapsed) * 1000
      await this.state.storage.setAlarm(Date.now() + remainingMs)
    }
  }

  private broadcast(msg: object) {
    const payload = JSON.stringify(msg)
    for (const ws of this.state.getWebSockets()) {
      try { ws.send(payload) } catch { ws.close() }
    }
  }

  private async rotateTrack() {
    const next = await this.getNextTrack()
    this.broadcast({ type: "track-change", data: next })
  }
}
```

What this replaces
| Current approach | With Durable Objects |
|---|---|
| Client polls `/now-playing` to discover track changes | DO pushes `track-change` to all sockets simultaneously |
| KV-based listener presence (eventual consistency, 8 reads per `/stations`) | `state.getWebSockets().length` is exact and instant |
| Chat polling every 5s via HTTP | Chat messages broadcast immediately via WebSocket |
| Heartbeat POST every 30s | WebSocket liveness is automatic (ping/pong) |
| Client recovery for missed track transitions | Server pushes transitions, client just obeys |
Track rotation via alarms
Instead of every client independently computing "what track is playing now" from wall-clock math, the DO owns track timing. When a track starts, the DO sets an alarm for when it ends:
```ts
this.state.storage.setAlarm(Date.now() + trackDurationMs)
```

When the alarm fires, `rotateTrack()` advances to the next track and broadcasts the change to all listeners at once. All clients transition together, with no clock drift or modulo instability.
Migration path
This doesn't need to be all-or-nothing:
- Add the DO class and wrangler config
- Keep existing HTTP endpoints working as-is
- Client tries WebSocket first, falls back to HTTP polling if connection fails
- Migrate features incrementally: presence first (simplest), then chat, then track sync
- Once WebSocket path is stable, remove the polling fallback
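The dual-transport step is easiest when both paths feed one message dispatcher. A sketch, where the envelope shape is an assumption chosen to match the DO's broadcast messages in the appendix:

```typescript
// Envelope assumed to match the DO broadcast shape ({ type, data }).
type ServerMsg =
  | { type: 'now-playing'; data: unknown }
  | { type: 'listeners'; data: { count: number } }
  | { type: 'chat'; data: unknown }
  | { type: 'reaction'; data: unknown }
  | { type: 'track-change'; data: unknown }

type Handlers = {
  [K in ServerMsg['type']]?: (data: Extract<ServerMsg, { type: K }>['data']) => void
}

// Both the WebSocket onmessage callback and the HTTP-polling fallback hand raw
// JSON to this one function, so the rest of the client never knows which
// transport delivered the update.
function dispatch(raw: string, handlers: Handlers): boolean {
  let msg: ServerMsg
  try { msg = JSON.parse(raw) as ServerMsg } catch { return false }
  const handler = handlers[msg.type] as ((d: unknown) => void) | undefined
  if (!handler) return false
  handler(msg.data)
  return true
}
```

Removing the polling fallback in the final step then touches only the transport layer, not the feature code behind the handlers.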
Cost model
Durable Objects pricing (as of early 2026):
- Requests: $0.15 per million
- Duration: $12.50 per million GB-s (only when active, not hibernating)
- Storage: $0.20 per GB-month
For a radio feature with ~50-200 concurrent listeners per station and 8 stations, the cost would be minimal. The hibernation API means you only pay for actual message processing, not for holding idle connections. This would likely be cheaper than the current approach of every client making HTTP requests every 5-30 seconds.
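A back-of-envelope check using the request rate quoted above; the traffic figures are assumptions, not measurements:

```typescript
// Assumed traffic: ~2 chat/reaction messages per second across all stations,
// plus 8 stations each rotating tracks ~20 times per hour.
const REQUEST_RATE_PER_MILLION = 0.15 // USD, from the pricing above

const messageInvocations = 2 * 86_400 * 30 // ≈ 5.18M webSocketMessage calls/month
const alarmInvocations = 8 * 20 * 24 * 30  // ≈ 115K alarm calls/month
const monthlyInvocations = messageInvocations + alarmInvocations

const requestCost = (monthlyInvocations / 1_000_000) * REQUEST_RATE_PER_MILLION
// ≈ $0.79/month in request charges before duration and storage.
```

Even with generous duration charges on top, that's far below the noise floor for a hobby-scale deployment, which supports the "minimal cost" claim.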
Priority Order
| Priority | Change | Effort | Impact |
|---|---|---|---|
| 1 | Pre-computed playlist schedules | Medium | Fixes the fundamental sync instability |
| 2 | Use D1 trims, drop per-request R2 reads | Low | Reduces latency and cost immediately |
| 3 | Reaction dedup/rate limiting | Low | Prevents abuse |
| 4 | Decompose client into composables | Medium | Maintainability, testability |
| 5 | Durable Objects + WebSockets | High | Real-time features, accurate presence |
| 6 | Track quality gating | Low | Prevents bad tracks from disrupting playback |
| 7 | Better genre classification | Low | Improves station relevance |
Summary
The current architecture works for low-to-moderate traffic with a slowly changing playlist, but it has a fundamental fragility: the sync model breaks on playlist changes, and the server does redundant work on every request. The client compensates for server-side gaps with complex recovery logic.
The path to long-term stability is: pre-computed schedules (fixes sync), persistent connections via Durable Objects (fixes presence/chat/reactions), cached trim data in D1 (fixes latency), and a decomposed client (fixes maintainability). The recovery logic can then be simplified dramatically because most of the failure modes it handles today stem from the server and client disagreeing about state.