impr: Radiance Cascades v2 #2617
… the new system, add cool factor
# Conflicts:
#	apps/typegpu-docs/package.json
#	apps/typegpu-docs/src/examples/rendering/radiance-cascades-drawing/drawInteraction.ts
#	apps/typegpu-docs/src/examples/rendering/radiance-cascades-drawing/index.ts
#	apps/typegpu-docs/src/examples/rendering/radiance-cascades/drag-controller.ts
#	apps/typegpu-docs/src/examples/rendering/radiance-cascades/index.ts
#	apps/typegpu-docs/tests/individual-example-tests/jump-flood-distance.test.ts
#	packages/typegpu-radiance-cascades/README.md
#	packages/typegpu-radiance-cascades/package.json
#	packages/typegpu-radiance-cascades/src/cascades.ts
#	packages/typegpu-radiance-cascades/src/index.ts
#	packages/typegpu-radiance-cascades/src/runner.ts
#	packages/typegpu-sdf/src/jumpFlood.ts
#	packages/typegpu/src/tgsl/accessProp.ts
#	pnpm-lock.yaml
📊 Bundle Size Comparison
👀 Notable results
Static test results: No major changes.
Dynamic test results:
📋 All results (356 entries)
If you wish to run a comparison for other, slower bundlers, run the 'Tree-shake test' from the GitHub Actions menu.
Resolution Time Benchmark
---
config:
themeVariables:
xyChart:
plotColorPalette: "#E63946, #3B82F6, #059669"
---
xychart
title "Random Branching (🔴 PR | 🔵 main | 🟢 release)"
x-axis "max depth" [1, 2, 3, 4, 5, 6, 7, 8]
y-axis "time (ms)"
line [0.79, 1.61, 3.75, 5.08, 6.47, 9.99, 19.08, 22.70]
line [0.87, 1.69, 3.67, 5.14, 6.13, 10.27, 18.78, 21.20]
line [0.84, 1.71, 3.79, 5.32, 6.28, 9.68, 18.02, 19.83]
---
config:
themeVariables:
xyChart:
plotColorPalette: "#E63946, #3B82F6, #059669"
---
xychart
title "Linear Recursion (🔴 PR | 🔵 main | 🟢 release)"
x-axis "max depth" [1, 2, 3, 4, 5, 6, 7, 8]
y-axis "time (ms)"
line [0.27, 0.49, 0.63, 0.75, 1.00, 1.08, 1.36, 1.43]
line [0.28, 0.51, 0.59, 0.77, 1.01, 1.03, 1.32, 1.48]
line [0.31, 0.49, 0.59, 0.78, 1.08, 1.05, 1.34, 1.39]
---
config:
themeVariables:
xyChart:
plotColorPalette: "#E63946, #3B82F6, #059669"
---
xychart
title "Full Tree (🔴 PR | 🔵 main | 🟢 release)"
x-axis "max depth" [1, 2, 3, 4, 5, 6, 7, 8]
y-axis "time (ms)"
line [0.91, 1.78, 3.70, 5.97, 10.40, 21.98, 48.26, 97.49]
line [0.84, 1.92, 3.85, 5.71, 10.53, 22.33, 47.24, 97.52]
line [0.75, 1.82, 3.58, 5.43, 10.58, 22.43, 48.85, 100.95]
@pullfrog pls review
✅ No new issues found.
Reviewed changes — v2 rewrite of the radiance cascades package with better cascade sizing, Morton ray ordering, configurable merge modes, encoder support, and memory optimization.
- Add `getCascadeInfo` with per-layer metadata: replaces the flat `getCascadeDim` with a rich `CascadeInfo` structure carrying per-layer probe counts, UV ranges, and valid dimensions. The old `getCascadeDim` is kept as a convenience wrapper.
- Morton Z-order ray ordering: replaces linear ray indexing with Morton encoding for more uniform angular sampling. `part1By1` and `morton2D` are defined in `cascades.ts` and used in `makeCascadePassCompute`.
- Aspect-correct ray directions: `rayDirection` and `segmentMetricLength` use `renderAspectSlot` to produce correct ray distributions on non-square render targets. `rayBoxExitUv` clips rays to the unit box.
- Configurable cascade layers: support for `baseStoredRayDim` (1/2/4), `mergeMode` (`hardware` or `bilinear-fix`), per-layer uniform buffers (`CascadeLayerParams`), and exposure of common slots (`maxRayStepsSlot`, `rayMarchStepSafetySlot`, `traceSegmentSlot`).
- Encoder support: `run()` accepts an optional `GPUCommandEncoder` so cascade and jump-flood passes can be batched with surrounding work. When omitted, `run()` creates and submits its own encoder.
- 2D ping-pong cascade textures by default: reduces memory usage significantly. 3D array textures are used only when `keepCascadeLayers: true`.
- Prebuilt JFA pipelines: `createJumpFlood` now prebuilds one pipeline per JFA step instead of writing `offsetUniform` at runtime. Fixes `destroy()` to clean up offset buffers and uniforms.
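For readers unfamiliar with the Morton ordering mentioned in the review, here is a minimal sketch of the standard bit-interleaving technique in plain TypeScript. The function names mirror `part1By1` and `morton2D` from the review, but the bodies below are the textbook formulation, not a copy of what `cascades.ts` contains; in particular, placing `x` in the even bits and `y` in the odd bits is an assumption.

```typescript
// Spread the low 16 bits of `v` apart so a zero bit sits between each
// original bit (e.g. 0b11 -> 0b0101).
function part1By1(v: number): number {
  let x = v & 0x0000ffff;
  x = (x | (x << 8)) & 0x00ff00ff;
  x = (x | (x << 4)) & 0x0f0f0f0f;
  x = (x | (x << 2)) & 0x33333333;
  x = (x | (x << 1)) & 0x55555555;
  return x >>> 0; // keep the result as an unsigned 32-bit value
}

// Interleave two 16-bit coordinates into one Z-order (Morton) index.
// Assumption: x occupies the even bits, y the odd bits.
function morton2D(x: number, y: number): number {
  return (part1By1(x) | (part1By1(y) << 1)) >>> 0;
}

// Walking a 2x2 grid row by row yields the indices 0, 1, 2, 3,
// i.e. the characteristic "Z" traversal.
const zOrder2x2 = [
  morton2D(0, 0),
  morton2D(1, 0),
  morton2D(0, 1),
  morton2D(1, 1),
];
```

Neighbouring Morton indices stay close together in both dimensions, which is the usual reason for preferring it over a linear row-major index.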
DeepSeek Pro (free via Pullfrog for OSS) | 𝕏
@pullfrog could you estimate how much less GPU memory we allocate for a couple of given resolutions?
The biggest memory win comes from the cascade ping-pong textures switching from 3D arrays to 2D (the default …
Cascade texture memory: old vs new (default …
| Resolution | Cascade Dim | Old Layers | Old Memory (2×3D) | New Memory (2×2D) | Savings |
|---|---|---|---|---|---|
| 4K (2160p) | 8192×4608 | 8 | 4.50 GB | 576 MB | 3.94 GB |
| 1440p | 4096×2304 | 7 | 1008 MB | 144 MB | 864 MB |
| 1080p | 4096×2304 | 7 | 1008 MB | 144 MB | 864 MB |
| 720p | 2048×1152 | 7→6 | 252 MB | 36 MB | 216 MB |
| 1024×768 | 2048×1536 | 7→6 | 336 MB | 48 MB | 288 MB |
| 800×600 | 1024×768 | 6 | 72 MB | 12 MB | 60 MB |
(1080p and 1440p share the same cascade dim because the closest-power-of-two diagonal rounds to the same value.)
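The figures in the table can be reproduced with a short calculation. The sketch below assumes 8 bytes per texel (for example four 16-bit float channels); that value is inferred from the table's numbers and is not stated in the comment itself. It also treats "MB" and "GB" as binary units (MiB and GiB), which is what the table's arithmetic implies. `estimateCascadeMemory` is a hypothetical helper, not part of the package.

```typescript
// Assumption: 8 bytes per texel, inferred from the table above.
const BYTES_PER_TEXEL = 8;
const MIB = 1024 * 1024;

interface CascadeMemoryEstimate {
  oldMiB: number; // two 3D array textures, one slice per cascade layer
  newMiB: number; // two 2D ping-pong textures
  savedMiB: number;
}

// Hypothetical helper: estimates cascade texture memory before and after
// the switch from 2x3D array textures to 2x2D ping-pong textures.
function estimateCascadeMemory(
  cascadeWidth: number,
  cascadeHeight: number,
  oldLayers: number,
): CascadeMemoryEstimate {
  const sliceBytes = cascadeWidth * cascadeHeight * BYTES_PER_TEXEL;
  const oldBytes = 2 * sliceBytes * oldLayers;
  const newBytes = 2 * sliceBytes;
  return {
    oldMiB: oldBytes / MIB,
    newMiB: newBytes / MIB,
    savedMiB: (oldBytes - newBytes) / MIB,
  };
}

// 1080p / 1440p row: cascade dim 4096x2304, 7 old layers.
const fullHd = estimateCascadeMemory(4096, 2304, 7);
// 4K row: cascade dim 8192x4608, 8 old layers.
const uhd = estimateCascadeMemory(8192, 4608, 8);
```

Under these assumptions `fullHd` comes out at 1008 MiB old, 144 MiB new, and 864 MiB saved, matching the table; `uhd` gives 4608 MiB (4.5 GiB) old and 576 MiB new.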
Tuning baseStoredRayDim (new path, 2×2D only)
Lower values reduce cascade dimensions further at the cost of ray angular resolution:
| Resolution | baseStoredRayDim=1 | baseStoredRayDim=2 (default) | baseStoredRayDim=4 |
|---|---|---|---|
| 4K | 144 MB | 576 MB | 2.25 GB |
| 1080p / 1440p | 36 MB | 144 MB | 576 MB |
| 720p | 9 MB | 36 MB | 144 MB |
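The rows above follow a quadratic pattern: each doubling of baseStoredRayDim quadruples the memory. A minimal sketch of that relationship follows, assuming each axis of the cascade texture scales linearly with baseStoredRayDim (an inference from the table, not something the comment states). `scaleForRayDim` is a hypothetical helper.

```typescript
type BaseStoredRayDim = 1 | 2 | 4;

// Hypothetical helper: rescales a memory figure measured at the default
// baseStoredRayDim (2) to another setting.
// Assumption: texture width and height both scale linearly with the
// setting, so memory scales with its square.
function scaleForRayDim(
  memoryAtDefaultMiB: number,
  baseStoredRayDim: BaseStoredRayDim,
): number {
  const ratio = baseStoredRayDim / 2;
  return memoryAtDefaultMiB * ratio * ratio;
}

// 4K row: 576 MiB at the default setting.
const uhdLow = scaleForRayDim(576, 1);
const uhdHigh = scaleForRayDim(576, 4);
```

Here `uhdLow` is 144 MiB and `uhdHigh` is 2304 MiB (2.25 GiB), in line with the 4K row.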
Uniform buffers
A negligible per-layer uniform buffer (CascadeLayerParams, ~48 B each) replaces the old single CascadeStaticParams + layerBuffer — about the same total, just split per-layer so no layerBuffer.write() is needed during dispatch.
Task list (4/4 completed)
- Explore codebase to understand GPU memory allocation patterns
- Find resolution-dependent buffer/texture allocations
- Calculate and estimate memory savings for given resolutions
- Report findings via pullfrog_report_progress