
Radio track switching and Safari recovery review

Date: 2026-02-21

Scope: .vitepress/components/RadioPage.vue, .vitepress/composables/useRadioWebSocket.ts, worker/src/radio.ts, worker/src/radio-station-do.ts, worker/src/index.ts

1) Current architecture in production code

1.1 Track switching has two server authorities

Path A is the HTTP now-playing handler in worker/src/radio.ts. It computes the track position from the wall clock modulo the playlist duration on every request. It also fetches trim values from R2 and can force a boundary advance when the current track is near its end (handleRadioNowPlaying, around L963-L1066).

Path B is the Durable Object websocket in worker/src/radio-station-do.ts. It builds a schedule in memory, computes the current track from the same wall clock modulo model, and emits track-change on alarm (ensureSchedule and alarm, around L350-L401 and L332-L346).
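Both paths rely on the same idea, so a minimal sketch of the wall clock modulo model may help. The names ScheduledTrack, Position, and positionAt are hypothetical and do not come from worker/src/radio.ts or worker/src/radio-station-do.ts. The sketch assumes a non-negative epoch timestamp and ignores trims.

```typescript
// Hypothetical shape; the real playlist entries may carry more fields.
interface ScheduledTrack {
  id: string;
  durationMs: number;
}

interface Position {
  index: number;         // index of the track that should be playing
  track: ScheduledTrack;
  offsetMs: number;      // how far into that track the station is
}

// Map a wall-clock timestamp (epoch ms, assumed >= 0) to a playlist position.
function positionAt(nowMs: number, playlist: ScheduledTrack[]): Position {
  const totalMs = playlist.reduce((sum, t) => sum + t.durationMs, 0);
  if (totalMs <= 0) throw new Error("playlist has no playable duration");

  // Wrap the wall clock into a single loop of the playlist.
  let remaining = nowMs % totalMs;
  let index = 0;
  for (const track of playlist) {
    if (remaining < track.durationMs) {
      return { index, track, offsetMs: remaining };
    }
    remaining -= track.durationMs;
    index++;
  }
  // Not reachable: remaining is always smaller than totalMs.
  throw new Error("unreachable");
}

const demoPlaylist: ScheduledTrack[] = [
  { id: "a", durationMs: 1000 },
  { id: "b", durationMs: 2000 },
  { id: "c", durationMs: 3000 },
];
const demoPosition = positionAt(1500, demoPlaylist); // track "b", 500 ms in
```

The result depends only on the clock and the duration list. If the two paths feed different durations into this calculation, for example because only one of them applies trims, they will report different positions for the same instant.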

Both paths are active in the router (worker/src/index.ts, around L9372-L9427).

1.2 Client switching has several local authorities

RadioPage.vue can switch tracks in at least four ways.

  1. Local predictive crossfade near end (startCrossfade, onTimeUpdate, around L872-L921 and L1190-L1195).
  2. Local onTrackEnded transition to nextTrack (onTrackEnded, around L1203-L1266).
  3. Websocket track-change event handler (connectWebSocket, around L1888-L1922).
  4. Recovery reload via HTTP loadNowPlaying (recoverPlayback, around L1732-L1792).

This means switching can happen from timer prediction, media events, websocket push, and recovery pull.

1.3 Safari protection uses an extra keepalive audio graph

RadioPage.vue creates a separate AudioContext with a low gain oscillator (startKeepAlive, around L468-L486). This context is not the same pipeline as the <audio> element playback. Resume logic then tries to recover context state plus media element state through visibility, pageshow, click, keydown, touchstart, and watchdog triggers (resumeAudioContext, onVisibilityChange, onPageShow, checkPlaybackHealth, around L1562-L1728).

2) Where the band-aids show up

2.1 Dual authority drift

The player starts from the HTTP now-playing response, then subscribes to websocket events. Recovery still falls back to HTTP and can overwrite websocket state (switchStation, loadNowPlaying, recoverPlayback, around L708-L743, L747-L788, L1732-L1792). If these paths disagree on timing or trims, playback on the client jumps.

2.2 Extra transition logic for edge boundaries

The server now-playing path includes a boundary buffer that skips tracks which are about to end (worker/src/radio.ts, around L1019-L1048). The client also has its own near-end crossfade and its own end-of-track handling. These are separate patches around the same root issue: there is no single owner for transitions.

2.3 Recovery trigger explosion

Recovery enters from network events, visibility, pageshow, audio error, waiting or stalled, watchdog stall, unexpected pause, and no track. All feed one function, which is good, but the number of entry points is a signal that state ownership is unclear (RadioPage.vue, around L1296-L1792).

2.4 Mixed real time and polling

The page uses websocket plus heartbeat polling and chat polling at the same time (startHeartbeat, startChatPolling, around L1395-L1436). Listener and chat state can be updated by different channels.

3) Root causes, not symptoms

  1. Track switching ownership is split across server pull, server push, and client prediction.
  2. Playback pipeline ownership is split across HTML media elements, WaveSurfer internals, and a separate keepalive context.
  3. Recovery policy is event driven but not state machine driven.

4) KISS target

4.1 One authority for timeline

Use websocket station DO as the source of truth for trackId, startedAt, trimStart, and nextTrack. Keep HTTP now playing only as a fallback for cold start or websocket failure. Do not let recovery overwrite active websocket state unless websocket is down.
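One way to express this rule is a small gate that every timeline update passes through. This is a sketch under assumptions: TimelineState, TimelineUpdate, and applyTimelineUpdate are hypothetical names, and the field types are guesses based on the four fields listed above.

```typescript
type TimelineSource = "websocket" | "http";

// Hypothetical shape built from the fields named in 4.1.
interface TimelineState {
  trackId: string;
  startedAt: number;        // epoch ms (assumed)
  trimStart: number;        // unit as sent by the server (assumed numeric)
  nextTrack: string | null; // assumed to be an id; the real payload may differ
}

interface TimelineUpdate {
  source: TimelineSource;
  state: TimelineState;
}

// Decide which timeline the client should hold after an update arrives.
function applyTimelineUpdate(
  current: TimelineState | null,
  update: TimelineUpdate,
  wsConnected: boolean,
): TimelineState {
  // Websocket pushes are always authoritative.
  if (update.source === "websocket") return update.state;

  // HTTP is accepted on cold start or while the websocket is down.
  if (current === null || !wsConnected) return update.state;

  // Otherwise keep the websocket-owned state and drop the HTTP result.
  return current;
}
```

With a gate like this, recoverPlayback could still call loadNowPlaying, but its result would be ignored while the websocket is healthy.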

4.2 One audio engine in the client

Create one useRadioAudioEngine composable with a single explicit state machine:

idle -> primed -> playing -> recovering -> playing

All playback and recovery transitions go through this machine.
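A minimal sketch of such a machine, as a pure transition table, is shown here. The state names come from the line above; the event names (prime, play, fault, recovered) and the transition helper are hypothetical, and a real engine would likely need more events, such as a stop.

```typescript
type EngineState = "idle" | "primed" | "playing" | "recovering";
type EngineEvent = "prime" | "play" | "fault" | "recovered";

// Only the transitions in the diagram are allowed.
const TRANSITIONS: Record<EngineState, Partial<Record<EngineEvent, EngineState>>> = {
  idle: { prime: "primed" },
  primed: { play: "playing" },
  playing: { fault: "recovering" },
  recovering: { recovered: "playing" },
};

// Return the next state, or the unchanged state if the event is not valid here.
function transition(state: EngineState, event: EngineEvent): EngineState {
  return TRANSITIONS[state][event] ?? state;
}

// Walk the happy path plus one recovery cycle.
const demoEvents: EngineEvent[] = ["prime", "play", "fault", "recovered"];
const demoFinalState = demoEvents.reduce<EngineState>(transition, "idle");
```

Because invalid events leave the state unchanged, a second fault that arrives while the engine is already recovering is a no-op instead of a second recovery attempt.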

4.3 One always-live AudioContext

Your idea is valid and it removes hidden browser behavior from the equation. The key detail is implementation mode.

Mode A uses no <audio> elements and plays AudioBufferSourceNode objects only. This gives full control and one graph for track audio and oscillator keepalive. It also means full fetch and decode before playback, more memory use, and a more complex buffering model.

Mode B keeps <audio> elements as decoders but routes them through one AudioContext graph with MediaElementAudioSourceNode and gain nodes. This gives less control than Mode A but carries much less risk.

If KISS is the top priority now, Mode B is the safer first step. If full deterministic control is the top priority, Mode A is the end state.

5) Practical migration plan

Step 1: Remove dual switching ownership. Disable client predictive crossfade and local autonomous switching first. Apply server pushed track-change as the default path.

Step 2: Move Safari keepalive into the same engine context. Drop the separate keepalive context and all context introspection through WaveSurfer internals.

Step 3: Reduce recovery to three causes only. Cause 1 is websocket disconnected. Cause 2 is context suspended. Cause 3 is media stalled beyond threshold.
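The three causes can be derived from a single health snapshot instead of from many event handlers. In the sketch below, HealthSnapshot, pickRecoveryCause, and the 5000 ms threshold are all hypothetical, and the priority order (websocket, then context, then stall) is an assumption rather than something the current code defines.

```typescript
type RecoveryCause =
  | "websocket-disconnected"
  | "context-suspended"
  | "media-stalled";

// Hypothetical snapshot the engine would assemble before deciding to recover.
interface HealthSnapshot {
  wsConnected: boolean;
  contextState: string;     // value of AudioContext.state, e.g. "running" or "suspended"
  msSinceProgress: number;  // time since the playback position last advanced
}

// Assumed threshold; the real value would need tuning.
const STALL_THRESHOLD_MS = 5000;

// Return the single cause to act on, or null if playback looks healthy.
function pickRecoveryCause(snapshot: HealthSnapshot): RecoveryCause | null {
  if (!snapshot.wsConnected) return "websocket-disconnected";
  if (snapshot.contextState === "suspended") return "context-suspended";
  if (snapshot.msSinceProgress > STALL_THRESHOLD_MS) return "media-stalled";
  return null;
}
```

Visibility, pageshow, and input events would then only request a health check; they would no longer act as recovery reasons themselves.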

Step 4: Keep one reconnect loop in websocket composable. Remove parallel retry loops from player page.

Step 5: Once stable, decide Mode A or Mode B for the audio graph. If Mode A, add explicit prefetch and decode queue with bounded memory and backpressure.

6) Suggested immediate cuts to complexity

  1. In RadioPage.vue, stop calling syncNextTrack after local transitions. Let websocket updates own next track data.
  2. Stop heartbeat and chat polling when websocket is connected.
  3. Handle websocket disconnected and error events explicitly in the page state machine.
  4. Replace the many recovery reason strings with a typed enum and a single deduplicated trigger queue.
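For item 4, a deduplicated trigger queue can be very small. The sketch below is one possible shape, not code from RadioPage.vue: DedupQueue and the RecoveryReason values are hypothetical, and a string literal union stands in for the typed enum.

```typescript
// Hypothetical reason type; a string literal union gives the same checking as an enum.
type RecoveryReason = "websocket-disconnected" | "context-suspended" | "media-stalled";

// Queue that keeps at most one pending entry per distinct value.
class DedupQueue<T> {
  private readonly pending = new Set<T>();

  // Returns true if the item was added, false if it was already pending.
  enqueue(item: T): boolean {
    if (this.pending.has(item)) return false;
    this.pending.add(item);
    return true;
  }

  // Remove and return all pending items in insertion order.
  drain(): T[] {
    const items = Array.from(this.pending);
    this.pending.clear();
    return items;
  }

  get size(): number {
    return this.pending.size;
  }
}

const demoQueue = new DedupQueue<RecoveryReason>();
demoQueue.enqueue("media-stalled");
demoQueue.enqueue("media-stalled");      // ignored, already pending
demoQueue.enqueue("context-suspended");
const demoDrained = demoQueue.drain();   // two entries, not three
```

A burst of stalled and waiting events would then collapse into one pending trigger, which the recovery function drains once per attempt.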

7) Decision summary

The current code works, but it is carrying many local fixes because switching and recovery have multiple owners. The clean path is one timeline owner and one client audio engine owner. Your proposal of one always-live AudioContext is directionally correct. The only decision left is whether to go straight to buffer sources or take the lower risk hybrid mode first.