Building an open world in the browser, part 4: Streaming before fancy terrain
By Oleg Sidorkin, CTO and Co-Founder of Cinevva
New here? Use the series guide. It explains what a spike is and links all parts.
Streaming is where "looks good" projects usually fall apart.
You can hide a lot in a still frame. You cannot hide a 40 ms hitch while crossing a chunk boundary.
We intentionally tested streaming before we built an advanced terrain representation. That gave us a clean signal on load and unload behavior.
Spike 6 validated neighborhood churn with simple chunk content.
Open Spike 6 in a new tab ↗ · View source
Then we moved to the real terrain path in Spike 11: height chunk streaming with worker-side decode and progressive refinement from 17x17 to 33x33 to 65x65 sample grids.
Open Spike 11 in a new tab ↗ · View source
The sequencing mattered more than we expected. If we had started directly with compressed height chunks, every hitch would have been ambiguous: a decode issue, a texture upload issue, or a geometry update issue. Spike 6 removed one layer of uncertainty before Spike 11 added complexity.
A practical lesson from this chapter carried into later spikes. Upload stalls must be measured directly, not inferred from average FPS. Average FPS hides frame spikes, and frame spikes are what users actually feel.
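To make the point about averages concrete, here is a minimal sketch, not the measurement code from the spikes. The `summarizeFrames` helper, the `FrameStats` shape, and the 33 ms hitch threshold are all assumptions chosen for illustration.

```typescript
// Hypothetical helper: summarizes per-frame durations given in milliseconds.
interface FrameStats {
  averageFps: number;
  worstFrameMs: number;
  hitchCount: number;
}

// hitchThresholdMs is an assumed budget; pick whatever your target frame rate implies.
function summarizeFrames(frameTimesMs: number[], hitchThresholdMs = 33): FrameStats {
  if (frameTimesMs.length === 0) {
    return { averageFps: 0, worstFrameMs: 0, hitchCount: 0 };
  }
  let total = 0;
  let worst = 0;
  let hitches = 0;
  for (const ms of frameTimesMs) {
    total += ms;
    if (ms > worst) worst = ms;
    if (ms > hitchThresholdMs) hitches++;
  }
  return {
    averageFps: 1000 / (total / frameTimesMs.length),
    worstFrameMs: worst,
    hitchCount: hitches,
  };
}

// 99 smooth 16 ms frames plus one 40 ms hitch, like a stall at a chunk boundary.
const frames = new Array<number>(99).fill(16);
frames.push(40);
const stats = summarizeFrames(frames);
// The average stays above 60 FPS even though one frame took 40 ms.
console.log(stats);
```

In a browser, the input would typically be the deltas between successive `requestAnimationFrame` timestamps. Logging the worst frame and the hitch count next to each chunk load is one way to tie a spike to a specific upload rather than guessing from the average.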
In part 5 we move into the visual cost chapter, where vegetation, terrain shaders, and cascaded shadows compete for the same frame budget.
Technology referenced in this chapter
Chunk-based streaming. The world is divided into a grid of independent chunks (typically 64x64 meters). As the player moves, chunks on the trailing edge unload while chunks on the leading edge stream in. This is how Skyrim's cell system works: a 5x5 grid of cells loaded around the player, swapping as they move. The browser version adds network latency to the equation, making predictive pre-fetching based on player velocity critical. See our streaming architecture guide.
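The load and unload churn described above can be sketched as a set difference between the old and new neighborhoods. This is an illustrative sketch only: the `"cx,cz"` key format, the function names, and the radius of 2 (which yields a 5x5 neighborhood) are assumptions, not the spike's actual code.

```typescript
const CHUNK_SIZE_M = 64; // chunk edge length, as in the text
const LOAD_RADIUS = 2;   // assumed: 2 chunks each side gives a 5x5 neighborhood

type ChunkKey = string;  // assumed key format: "cx,cz"

// Maps a world-space coordinate in meters to a chunk index.
function chunkCoord(worldM: number): number {
  return Math.floor(worldM / CHUNK_SIZE_M);
}

// Returns the keys of all chunks in the square neighborhood around a position.
function neighborhood(x: number, z: number, radius = LOAD_RADIUS): Set<ChunkKey> {
  const cx = chunkCoord(x);
  const cz = chunkCoord(z);
  const keys = new Set<ChunkKey>();
  for (let dz = -radius; dz <= radius; dz++) {
    for (let dx = -radius; dx <= radius; dx++) {
      keys.add(`${cx + dx},${cz + dz}`);
    }
  }
  return keys;
}

// Chunks to stream in on the leading edge and to drop on the trailing edge.
function diffChunks(loaded: Set<ChunkKey>, wanted: Set<ChunkKey>) {
  const toLoad = Array.from(wanted).filter((k) => !loaded.has(k));
  const toUnload = Array.from(loaded).filter((k) => !wanted.has(k));
  return { toLoad, toUnload };
}

const before = neighborhood(10, 10); // player in chunk (0, 0)
const after = neighborhood(70, 10);  // player crossed into chunk (1, 0)
const churn = diffChunks(before, after);
console.log(churn.toLoad, churn.toUnload);
```

Crossing one boundary along x swaps a single column: five chunks load on the leading edge and five unload on the trailing edge.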
Progressive heightmap refinement. Send terrain at low resolution first, then refine. A 17x17 grid (the minimum for a 64m chunk at 4m spacing) is ~200 bytes compressed and is enough to render a visible surface almost immediately. Next comes the 33x33 refinement (2m spacing), then the full 65x65 resolution (1m spacing). Each level adds samples without replacing previous data. This maps directly to geometry clipmap LOD rings where distant terrain uses low-resolution data and close-up terrain uses full resolution. See progressive chunk loading.
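One way to read "adds samples without replacing previous data" is that the coarse samples sit at the even rows and columns of the finer grid, and a refinement payload carries only the samples in between. The sketch below assumes exactly that layout, with the new samples arriving in row-major order. The `refineGrid` name and the payload format are hypothetical, not the format used in Spike 11.

```typescript
// Merges a coarse N x N grid with the samples a refinement level adds,
// producing a (2N - 1) x (2N - 1) grid. Coarse samples are kept unchanged.
function refineGrid(coarse: Uint16Array, coarseSize: number, added: Uint16Array): Uint16Array {
  const fineSize = coarseSize * 2 - 1;
  const expectedAdded = fineSize * fineSize - coarseSize * coarseSize;
  if (coarse.length !== coarseSize * coarseSize || added.length !== expectedAdded) {
    throw new Error("grid or refinement payload has the wrong length");
  }
  const fine = new Uint16Array(fineSize * fineSize);
  let next = 0; // read cursor into the refinement payload
  for (let row = 0; row < fineSize; row++) {
    for (let col = 0; col < fineSize; col++) {
      const isCoarse = row % 2 === 0 && col % 2 === 0;
      fine[row * fineSize + col] = isCoarse
        ? coarse[(row / 2) * coarseSize + col / 2] // reuse the existing sample
        : added[next++];                           // take the next new sample
    }
  }
  return fine;
}

// 17 -> 33 -> 65, using zero-filled placeholder data.
const level17 = new Uint16Array(17 * 17);
const level33 = refineGrid(level17, 17, new Uint16Array(33 * 33 - 17 * 17));
const level65 = refineGrid(level33, 33, new Uint16Array(65 * 65 - 33 * 33));
console.log(level33.length, level65.length);
```

Under this layout the 33x33 level transmits 800 new samples and the 65x65 level 3,136, so the total sent across all three levels equals the 4,225 samples of the full grid.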
Delta encoding and compression. Heightmap data compresses well because adjacent cells have similar values. Delta encoding stores the difference between each cell and its predicted value (average of neighbors), clustering values near zero. Combined with zlib or brotli, a 65x65 chunk drops from 8.4 KB raw (4,225 16-bit samples) to 1-2 KB compressed. At reduced precision for distant chunks (8-bit instead of 16-bit), that falls to 0.5-1 KB. See terrain data compression.
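A minimal sketch of the delta step, under stated assumptions: the predictor here averages only the left and upper neighbors, because those are the ones a decoder has already reconstructed in row-major order. The text's "average of neighbors" may refer to a different neighbor set. Residuals are kept in an `Int32Array` for clarity, and the zlib or brotli pass is left out. All function names are hypothetical.

```typescript
// Predicts a sample from neighbors that are already known in row-major order.
function predict(heights: Uint16Array, size: number, row: number, col: number): number {
  if (row === 0 && col === 0) return 0;                   // nothing to predict from
  if (row === 0) return heights[col - 1];                 // first row: left neighbor
  if (col === 0) return heights[(row - 1) * size];        // first column: upper neighbor
  const left = heights[row * size + col - 1];
  const up = heights[(row - 1) * size + col];
  return (left + up) >> 1;                                // integer average
}

// Stores each sample as the difference from its prediction.
function deltaEncode(heights: Uint16Array, size: number): Int32Array {
  const residuals = new Int32Array(heights.length);
  for (let row = 0; row < size; row++) {
    for (let col = 0; col < size; col++) {
      const i = row * size + col;
      residuals[i] = heights[i] - predict(heights, size, row, col);
    }
  }
  return residuals;
}

// Rebuilds the samples by running the same predictor over decoded values.
function deltaDecode(residuals: Int32Array, size: number): Uint16Array {
  const heights = new Uint16Array(residuals.length);
  for (let row = 0; row < size; row++) {
    for (let col = 0; col < size; col++) {
      const i = row * size + col;
      heights[i] = predict(heights, size, row, col) + residuals[i];
    }
  }
  return heights;
}

// A gentle 3x3 slope: after the first sample, every residual is 1 or 2.
const slope = new Uint16Array([1000, 1002, 1004, 1001, 1003, 1005, 1002, 1004, 1006]);
const residuals = deltaEncode(slope, 3);
const restored = deltaDecode(residuals, 3);
console.log(Array.from(residuals), Array.from(restored));
```

Because the encoder and decoder share the same predictor and the same traversal order, the round trip is lossless. The small, repetitive residuals are what a general-purpose compressor then shrinks.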
Predictive pre-fetching. Load chunks before the player arrives. At walking speed (5 km/h), pre-fetch 2 chunks ahead (128m). At running speed (15 km/h), pre-fetch 4 chunks. The load ring shifts with velocity direction. A priority queue sorts pending requests by urgency and cancels requests for chunks the player has moved away from. See predictive pre-fetching.
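The sketch below shows one possible shape for that queue. Only the two data points (2 chunks at 5 km/h, 4 chunks at 15 km/h) come from the text. The linear interpolation between them, the "behind the player" cancellation rule, and every name are assumptions made for illustration. A sorted array stands in for a real priority queue.

```typescript
interface Vec2 { x: number; z: number }
interface ChunkRequest { cx: number; cz: number }

// Assumed: interpolate linearly between the two speeds given in the text.
function lookaheadChunks(speedKmh: number): number {
  const t = Math.min(1, Math.max(0, (speedKmh - 5) / (15 - 5)));
  return Math.ceil(2 + t * 2);
}

// Splits pending requests into a sorted fetch queue and a cancel list.
// `dir` is assumed to be a unit vector in chunk space.
function prioritize(
  pending: ChunkRequest[],
  player: ChunkRequest,
  dir: Vec2,
  speedKmh: number,
): { queue: ChunkRequest[]; cancelled: ChunkRequest[] } {
  const lookahead = lookaheadChunks(speedKmh);
  const queue: { req: ChunkRequest; dist: number }[] = [];
  const cancelled: ChunkRequest[] = [];
  for (const req of pending) {
    const dx = req.cx - player.cx;
    const dz = req.cz - player.cz;
    const forward = dx * dir.x + dz * dir.z; // distance along the travel direction
    if (forward < 0) {
      cancelled.push(req);                   // behind the player: cancel
    } else if (forward <= lookahead) {
      queue.push({ req, dist: Math.hypot(dx, dz) });
    }
    // Requests beyond the lookahead are neither queued nor cancelled yet.
  }
  queue.sort((a, b) => a.dist - b.dist);     // nearest chunk is most urgent
  return { queue: queue.map((q) => q.req), cancelled };
}

const pending: ChunkRequest[] = [
  { cx: 3, cz: 0 }, { cx: 1, cz: 0 }, { cx: -1, cz: 0 }, { cx: 2, cz: 1 },
];
const player: ChunkRequest = { cx: 0, cz: 0 };
const east: Vec2 = { x: 1, z: 0 };
const walking = prioritize(pending, player, east, 5);
const running = prioritize(pending, player, east, 15);
console.log(walking.queue, running.queue, running.cancelled);
```

With the player heading east, the walking case queues the two chunks within 2 chunks ahead, while the running case also queues the chunk 3 ahead. Both cancel the chunk behind the player. In a browser, cancellation would usually mean aborting the in-flight request, for example through an `AbortController` passed to `fetch`.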
Part 4 of 12.
Previous: Part 3 - The unflashy spikes that saved us
Next: Part 5 - Budgeting the pretty stuff
Series guide: /blog/2026-02-25-open-world-browser-series-guide