Building an open world in the browser, part 22: Clouds you can light, and culling that has to be fed
By Oleg Sidorkin, CTO and Co-Founder of Cinevva
New here? Use the series guide. It explains what a spike is and links all parts.
Part 21 was about a rendering technique that didn't pay off. This part has one that did, and one that needed a careful fix to work at all. Spike 43 is the sky: a physically based atmosphere and volumetric clouds, the foundation that makes a scene read as a place instead of a tech demo. Spike 44 is meshlet-style GPU culling, where the lesson was that the occlusion test is only as good as the occluders you feed it.
A sky from physics, not a gradient
Open Spike 43 in a new tab ↗ · View source
Almost every cinematic weather effect depends on two pieces of infrastructure: a physically based atmosphere, so the sky color and sun color follow time of day from physics rather than a hand-tuned gradient, and a volumetric cloud volume, so the sky has 3D structure instead of a baked cubemap. Spike 43 builds exactly that pair on the existing WebGPU and TSL stack, nothing more, because once those two exist the rest of the weather stack (fog, god rays, wet surfaces, snow) becomes a series of smaller known follow-ons.
The atmosphere is the Hillaire 2020 model, a set of lookup tables computed in WGSL compute shaders. A transmittance table integrates sunlight through the Rayleigh, Mie, and ozone density profile and recomputes only when the sun moves. A sky-view table re-bakes every frame because it's cheap enough that gating it isn't worth the code, with a non-linear parameterization around the horizon to avoid banding. Multiple scattering uses an analytic fit instead of the proper table for now, and the sunsets read correctly, so the shortcut is hiding well.
The clouds are a Schneider Nubis-style ray-march through a horizontal slab. Shape comes from a 128³ Perlin-Worley texture eroded by a 32³ Worley texture, both baked at boot in compute with no network fetch. Lighting uses Beer's law extinction, a dual-lobe phase function, and a powder approximation. The key coupling is that the cloud's sun color samples the same transmittance table every step, so cloud lighting tracks sunset without a second tuning pass.
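The cloud lighting terms named above can be sketched on the CPU in a few lines. This is a minimal illustration, not the spike's WGSL: the lobe parameters, the blend weight, the Beer-powder form, and the `CloudStep` shape are all assumptions, and `sunColor` stands in for the per-step transmittance-table lookup.

```typescript
type RGB = [number, number, number];

// Henyey-Greenstein phase function. cosTheta is the cosine of the angle
// between the view ray and the sun direction; g in (-1, 1) shapes the lobe.
function henyeyGreenstein(cosTheta: number, g: number): number {
  const g2 = g * g;
  return (1 - g2) / (4 * Math.PI * Math.pow(1 + g2 - 2 * g * cosTheta, 1.5));
}

// Dual-lobe phase: a forward lobe blended with a softer backward lobe.
// The default parameters are illustrative guesses, not the spike's values.
function dualLobePhase(cosTheta: number, gForward = 0.8, gBackward = -0.3, backWeight = 0.3): number {
  return (1 - backWeight) * henyeyGreenstein(cosTheta, gForward)
       + backWeight * henyeyGreenstein(cosTheta, gBackward);
}

// Beer's law: fraction of light that survives a given optical depth.
function beer(opticalDepth: number): number {
  return Math.exp(-opticalDepth);
}

// Powder term: attenuates samples with little cloud between them and the sun.
// This is one commonly cited form; production variants differ.
function powder(opticalDepth: number): number {
  return 1 - Math.exp(-2 * opticalDepth);
}

// Hypothetical per-step inputs for the march.
interface CloudStep {
  density: number;         // shaped cloud density at this step, 0..1
  sunOpticalDepth: number; // optical depth through cloud toward the sun
  sunColor: RGB;           // sun color after atmospheric transmittance (the LUT lookup)
}

// Front-to-back accumulation along the view ray.
function marchCloud(
  steps: CloudStep[], stepLengthM: number, extinctionPerM: number, cosTheta: number,
): { color: RGB; transmittance: number } {
  const color: RGB = [0, 0, 0];
  let transmittance = 1;
  const phase = dualLobePhase(cosTheta);
  for (const s of steps) {
    if (s.density <= 0) continue;
    const stepTransmittance = beer(s.density * extinctionPerM * stepLengthM);
    const energy = beer(s.sunOpticalDepth) * powder(s.sunOpticalDepth) * phase;
    const weight = transmittance * (1 - stepTransmittance) * energy;
    color[0] += s.sunColor[0] * weight;
    color[1] += s.sunColor[1] * weight;
    color[2] += s.sunColor[2] * weight;
    transmittance *= stepTransmittance;
  }
  return { color, transmittance };
}
```

Because `sunColor` is the only place the sun enters, reddening it at sunset reddens the cloud with no cloud-specific tuning, which is the coupling the paragraph describes.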
On an M1 the sky and clouds together cost 1.1 to 2.0 ms per frame with half-resolution clouds, well under the 6 ms budget, use about 14 MB of GPU memory, and leave the spike running at over 100 FPS. The two thesis claims held up in practice. Sunset is the hero shot, the moment that makes the renderer feel cinematic, and it falls out of the physics without per-time-of-day tuning. And a happy accident: with the cloud slab parameterized between 800 m and 4000 m, distant clouds at the horizon read as dark mountain ridges from a low camera, which gives the world background terrain without anyone modeling background terrain.
One architecture note worth keeping. The natural shape is to paint the sky into the swap chain first, then let three.js draw geometry on top with autoClear = false. That does not survive the WebGPU renderer on r184, because the flag doesn't gate the color load op the way it does in WebGL, so three.js clobbers the sky every frame. The fix is to render three.js into an offscreen target and do the final composite (mix(skyCloud, scene, scene.alpha) then ACES then sRGB) in our own pass that owns the swap chain.
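The per-pixel math of that final composite can be sketched as follows. This is a CPU illustration under assumptions: the post does not say which ACES curve is used, so the widely used Narkowicz fit stands in, the scene color is treated as straight (non-premultiplied) alpha, and `compositePixel` is a hypothetical name.

```typescript
type RGB = [number, number, number];
type RGBA = [number, number, number, number];

// Linear interpolation, matching the shader-style mix(a, b, t).
function mix(a: number, b: number, t: number): number {
  return a * (1 - t) + b * t;
}

// ACES filmic tone-map approximation (Narkowicz fit), clamped to [0, 1].
// Assumption: the spike may use a different ACES variant.
function acesFilm(x: number): number {
  const v = (x * (2.51 * x + 0.03)) / (x * (2.43 * x + 0.59) + 0.14);
  return Math.min(1, Math.max(0, v));
}

// Linear to sRGB transfer function.
function linearToSrgb(c: number): number {
  return c <= 0.0031308 ? 12.92 * c : 1.055 * Math.pow(c, 1 / 2.4) - 0.055;
}

// Composite order from the text: mix sky and scene by scene alpha,
// then tone-map, then encode to sRGB.
function compositePixel(skyCloud: RGB, scene: RGBA): RGB {
  const alpha = scene[3];
  const channel = (sky: number, geo: number): number =>
    linearToSrgb(acesFilm(mix(sky, geo, alpha)));
  return [
    channel(skyCloud[0], scene[0]),
    channel(skyCloud[1], scene[1]),
    channel(skyCloud[2], scene[2]),
  ];
}
```

Where the offscreen target has alpha 0 (no geometry drawn), the sky passes through untouched, which is what makes the sky-first mental model work again without relying on the load op.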
Culling that's only as good as its occluders
Open Spike 44 in a new tab ↗ · View source
Spike 44 benches four rendering modes against each other: plain forward, CPU cluster culling, GPU compute culling, and a visibility buffer with Hi-Z occlusion culling. The Hi-Z path is the interesting one, and it had a quiet bug: its HUD said occlusion was on, but the "Hi-Z killed" counter sat at exactly 0.0% forever. Frustum culling worked, so everything in the cull shader upstream of the occlusion test was fine. The occlusion half was a no-op that still paid full cost.
A Hi-Z occlusion test projects a cluster's bounding box to screen, picks a depth-pyramid mip level so the screen rectangle is about 2×2 texels, samples the deepest occluder depth in that rectangle, and rejects the cluster if its closest point is still farther than that occluder. The depth pyramid is built every frame by seeding mip 0 from an opaque-depth pre-pass and max-reducing up. The opaque pre-pass deliberately includes only solid occluders, the ground and per-tree trunk proxies, because alpha-tested foliage would punch gaps that fool a max-reduce.
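A CPU reference of that test helps pin down the moving parts. The sketch below assumes power-of-two depth buffers, a depth convention where larger means farther, and hypothetical names (`DepthMip`, `ScreenRect`, `buildHiZ`, `selectMip`, `isOccluded`); the spike does this in compute shaders.

```typescript
interface DepthMip { width: number; height: number; data: Float32Array; }

// Screen rectangle in pixels; max edges are exclusive.
interface ScreenRect { minX: number; minY: number; maxX: number; maxY: number; }

// Build the pyramid: each texel keeps the farthest (max) depth of its 2x2 parents.
function buildHiZ(mip0: DepthMip): DepthMip[] {
  const mips: DepthMip[] = [mip0];
  let prev = mip0;
  while (prev.width > 1 && prev.height > 1) {
    const width = prev.width >> 1;
    const height = prev.height >> 1;
    const data = new Float32Array(width * height);
    for (let y = 0; y < height; y++) {
      for (let x = 0; x < width; x++) {
        const i = 2 * y * prev.width + 2 * x;
        data[y * width + x] = Math.max(
          prev.data[i], prev.data[i + 1],
          prev.data[i + prev.width], prev.data[i + prev.width + 1],
        );
      }
    }
    prev = { width, height, data };
    mips.push(prev);
  }
  return mips;
}

// Pick the mip where the rectangle spans about 2x2 texels.
function selectMip(rect: ScreenRect, mipCount: number): number {
  const sizePx = Math.max(rect.maxX - rect.minX, rect.maxY - rect.minY, 1);
  const mip = Math.ceil(Math.log2(sizePx / 2));
  return Math.min(mipCount - 1, Math.max(0, mip));
}

// Reject the cluster only if its nearest depth is behind the farthest
// occluder depth found anywhere in its screen rectangle.
function isOccluded(mips: DepthMip[], rect: ScreenRect, nearestDepth: number): boolean {
  const level = selectMip(rect, mips.length);
  const mip = mips[level];
  const texel = 2 ** level;
  const x0 = Math.max(0, Math.floor(rect.minX / texel));
  const y0 = Math.max(0, Math.floor(rect.minY / texel));
  const x1 = Math.min(mip.width - 1, Math.floor((rect.maxX - 1) / texel));
  const y1 = Math.min(mip.height - 1, Math.floor((rect.maxY - 1) / texel));
  if (x1 < x0 || y1 < y0) return false; // off-screen: stay conservative
  let farthest = 0;
  for (let y = y0; y <= y1; y++) {
    for (let x = x0; x <= x1; x++) {
      farthest = Math.max(farthest, mip.data[y * mip.width + x]);
    }
  }
  return nearestDepth > farthest;
}
```

A misaligned rectangle can straddle three texels per axis, so this version loops over every texel it touches instead of sampling exactly four. That costs a few extra reads and keeps the test conservative.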
The bug was geometric, not logical. The trunk proxy was a 0.5 m by 4 m by 0.5 m box. At 30 m it projects to about 17 pixels wide on screen. But a typical 50 m grass cluster selects mip 5, where each texel covers a 32×32 block of source pixels. A 17-pixel-wide trunk can't fully cover a single mip-5 texel, so every texel touching the trunk also touches surrounding ground. Each 2×2 max-reduce keeps the larger depth value, which is the farther ground behind the trunk, so the trunk's depth is erased from any texel that also contains ground. By mip 5 the pyramid holds ground depth almost everywhere, the cluster is never farther than the ground, and nothing is ever occluded.
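The arithmetic behind that failure is short enough to write down. In this sketch the focal length of 1020 pixels is not from the spike; it is back-derived from the article's own figure (0.5 m at 30 m is about 17 pixels), and the function names are hypothetical.

```typescript
// Projected size in pixels of an object sizeM across at distanceM,
// for a pinhole camera with a focal length of focalPx pixels.
function projectedPixels(sizeM: number, distanceM: number, focalPx: number): number {
  return (sizeM * focalPx) / distanceM;
}

// Side length, in source pixels, of one texel at the given mip.
function texelSizePx(mip: number): number {
  return 2 ** mip;
}

// An occluder can fully own a texel only if it is at least one texel wide.
function canCoverTexel(widthPx: number, mip: number): boolean {
  return widthPx >= texelSizePx(mip);
}

// Highest mip at which some texel is fully covered no matter how the
// occluder is aligned to the texel grid: needs width >= 2 * texel - 1.
function highestGuaranteedMip(widthPx: number): number {
  const w = Math.floor(widthPx);
  if (w < 1) return -1;
  return Math.floor(Math.log2((w + 1) / 2));
}

// Assumption: focal length implied by "0.5 m at 30 m is about 17 px".
const FOCAL_PX = 1020;
const trunkWidthPx = projectedPixels(0.5, 30, FOCAL_PX);

console.log(trunkWidthPx, canCoverTexel(trunkWidthPx, 5), highestGuaranteedMip(trunkWidthPx));
```

Under these assumptions the trunk is only guaranteed to survive up to mip 3, two levels short of the mip the grass cluster samples.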
The fix is to make the proxy big enough to dominate the texels it lands in, sizing it to the tree's silhouette rather than its wood. A roughly 2 m by 6 m by 2 m proxy is still smaller than the actual canopy, so leaves visible through gaps never get over-culled, but it's large enough to survive the max-reduce out to the distances that matter, and the occlusion counter immediately went non-zero. The takeaway generalizes to a rule for the production engine: anything trusted as a Hi-Z occluder has to be sized to its on-screen silhouette, because Hi-Z effectiveness in open foliage scenes is dominated by occluder coverage at the relevant mip, not by the elegance of the depth-test math. Grass-versus-grass occlusion can't fire anyway, since a blade sits at the same depth as the ground under it, so the real wins are trees occluding distant foliage and trees occluding trees.
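The difference between the two proxy sizes shows up directly in a small simulation. The sketch below rasterizes a proxy rectangle into a 256×256 depth buffer over flat, far ground, max-reduces it to a chosen mip, and counts the texels that still hold proxy depth. The buffer size, placement, and depth values are arbitrary assumptions, and the pixel sizes are scaled from the article's 17 pixels per 0.5 m at 30 m.

```typescript
const GROUND_DEPTH = 1.0; // farther; exactly representable in float32
const PROXY_DEPTH = 0.5;  // nearer

// One 2x2 max-reduce step over a square, power-of-two buffer.
function maxReduce(src: Float32Array, size: number): Float32Array {
  const half = size >> 1;
  const dst = new Float32Array(half * half);
  for (let y = 0; y < half; y++) {
    for (let x = 0; x < half; x++) {
      const i = 2 * y * size + 2 * x;
      dst[y * half + x] = Math.max(src[i], src[i + 1], src[i + size], src[i + size + 1]);
    }
  }
  return dst;
}

// Count texels at the given mip that still carry the proxy's depth.
function survivingTexels(proxyWidthPx: number, proxyHeightPx: number, mip: number): number {
  const size = 256;
  const left = 40; // arbitrary, deliberately not texel-aligned
  const top = 20;
  let data: Float32Array = new Float32Array(size * size).fill(GROUND_DEPTH);
  for (let y = top; y < top + proxyHeightPx; y++) {
    for (let x = left; x < left + proxyWidthPx; x++) {
      data[y * size + x] = PROXY_DEPTH;
    }
  }
  let current = size;
  for (let level = 0; level < mip; level++) {
    data = maxReduce(data, current);
    current >>= 1;
  }
  let count = 0;
  for (let i = 0; i < data.length; i++) {
    if (data[i] < GROUND_DEPTH) count++;
  }
  return count;
}

// Trunk-sized proxy (0.5 m by 4 m) versus silhouette-sized proxy (2 m by 6 m) at 30 m.
const trunkSurvivors = survivingTexels(17, 136, 5);
const silhouetteSurvivors = survivingTexels(68, 204, 5);
console.log({ trunkSurvivors, silhouetteSurvivors });
```

With these inputs the trunk-sized proxy leaves no texel at mip 5, while the silhouette-sized one leaves several, which mirrors the counter going from exactly zero to non-zero.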
Technology referenced in this chapter
Hillaire 2020 atmosphere LUTs. A transmittance table (sunlight through the Rayleigh, Mie, and ozone profile) recomputed only on sun movement, plus a per-frame sky-view table with a non-linear horizon parameterization, give physically based sky and sun colors that track time of day with no hand-tuned gradient. An analytic multiple-scattering fit substitutes for the full table until an artifact forces the proper bake. Sunset falls out of the physics without per-time-of-day tuning.
Schneider Nubis volumetric clouds. A ray-march through a horizontal slab, shaped by a boot-baked 128³ Perlin-Worley texture eroded by a 32³ Worley texture, lit with Beer's law extinction, a dual-lobe phase function, and a powder term. Sampling cloud sun color from the same transmittance table every step makes cloud lighting track sunrise and sunset for free. A half-resolution ray-march runs roughly 4× cheaper than full resolution with no visible quality loss at typical distances, which is the standard production trade.
Compositing raw WebGPU with three.js on r184. Painting the sky into the swap chain and drawing three.js geometry over it with autoClear = false fails because the flag doesn't gate the color load op in the WebGPU backend. Render three.js into an offscreen RGBA16F target and do the final mix plus tone-map plus sRGB in a pass that owns the swap chain.
Hi-Z occlusion culling and occluder sizing. A depth pyramid built by max-reduce lets a GPU cull pass reject clusters whose nearest point is behind the deepest occluder in their screen rectangle. The test silently does nothing if occluders are too small to dominate a texel at the selected mip, because the max-reduce replaces the occluder's depth with the farther background behind it in every texel the occluder doesn't fully cover. Occluders must be sized to their on-screen silhouette, not their physical core. See GPU-driven LOD.
Part 22 of 29 · Previous: Part 21 - A faster renderer that wasn't faster · Next: Part 23 - Fifty avatars and a voice in the room · Series guide: /blog/2026-02-25-open-world-browser-series-guide