a voxel engine, part 6: shaping memory access

The renderer spends most of its time reading the map. Not computing, not drawing, but fetching data. On this platform, RAM access is slow enough that layout matters more than arithmetic.

A simple 256×256 map already shows this. Access in one direction is much faster than in the other. Sequential reads follow the layout and stay in cache. With one byte per cell, stepping in the other direction jumps 256 bytes at a time. The difference is large, close to a factor of two.
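To make the two access patterns concrete, here is a minimal sketch in C. It assumes a row-major map with one byte per cell; the names `height_map`, `cell_offset`, `sum_row` and `sum_column` are made up for illustration and do not come from the engine.

```c
#include <stddef.h>
#include <stdint.h>

#define MAP_SIZE 256u  /* cells per side, as in the text */

/* Assumed layout: one byte per cell, stored row by row. */
static uint8_t height_map[MAP_SIZE * MAP_SIZE];

/* Byte offset of cell (x, y) in the row-major layout. */
static size_t cell_offset(unsigned x, unsigned y)
{
    return (size_t)y * MAP_SIZE + x;
}

/* Walk along x: consecutive addresses, stride of 1 byte. */
static unsigned sum_row(unsigned y)
{
    unsigned sum = 0;
    for (unsigned x = 0; x < MAP_SIZE; x++)
        sum += height_map[cell_offset(x, y)];
    return sum;
}

/* Walk along y: every step jumps MAP_SIZE (256) bytes ahead. */
static unsigned sum_column(unsigned x)
{
    unsigned sum = 0;
    for (unsigned y = 0; y < MAP_SIZE; y++)
        sum += height_map[cell_offset(x, y)];
    return sum;
}
```

Both loops do the same amount of arithmetic. Only the distance between consecutive addresses differs.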

Combining height and color into a single 16-bit value improves this. Both are fetched in one access, which reduces bandwidth pressure. The downside is a larger stride: stepping across rows now jumps 512 bytes. This still works, but it does not scale well.
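A sketch of such a packed cell in C. Which byte holds the height and which holds the color is an assumption here, as are the names `cell_t`, `cell_pack`, `cell_height`, `cell_color` and `row_stride_bytes`.

```c
#include <stdint.h>

/* Assumed packing: height in the high byte, color in the low byte. */
typedef uint16_t cell_t;

static cell_t cell_pack(uint8_t height, uint8_t color)
{
    return (cell_t)(((unsigned)height << 8) | color);
}

static uint8_t cell_height(cell_t c) { return (uint8_t)(c >> 8); }
static uint8_t cell_color(cell_t c)  { return (uint8_t)(c & 0xFFu); }

/* Bytes between vertically adjacent cells in a row-major map
 * that is `width` cells wide. */
static unsigned row_stride_bytes(unsigned width)
{
    return width * (unsigned)sizeof(cell_t);
}
```

With this cell type, `row_stride_bytes(256)` is 512 and `row_stride_bytes(1024)` is 2048, which is the growth the text warns about.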

Larger maps make the problem worse. A naïve 1024×1024 layout of 16-bit cells would raise the stride to 2048 bytes and degrade performance quickly.

Instead, the map is split into fixed-size tiles. Each tile is 256×256, stored contiguously. A larger world is then a grid of these tiles. In practice, a 1024×1024 map becomes a 4×4 grid of 256×256 chunks.

Access within a tile remains cache-friendly. Crossing tile boundaries happens rarely and only requires updating a pointer, not changing the access pattern itself. Larger worlds behave almost like the small case.

The structure looks like this:
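A minimal sketch of such a structure in C. The type names (`cell_t`, `tile_t`, `world_t`) and the choice of a flat grid of tiles are assumptions for illustration, not taken from the engine's source.

```c
#include <stdint.h>

#define TILE_SIZE 256  /* cells per tile side */
#define TILES_X   4    /* 1024 / 256 */
#define TILES_Y   4

typedef uint16_t cell_t;  /* packed height + color */

/* One tile: 256×256 cells stored contiguously, row by row (128 KiB). */
typedef struct {
    cell_t cells[TILE_SIZE * TILE_SIZE];
} tile_t;

/* The world: a 4×4 grid of tiles, 1024×1024 cells in total. */
typedef struct {
    tile_t tiles[TILES_Y][TILES_X];
} world_t;
```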

Fetching a value then becomes a two-step process: select the tile, then index inside it.
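The two steps can be sketched with shifts and masks, since the tile size is a power of two. This is an illustration under assumptions: the names `world`, `tile_at`, `world_fetch` and `world_store` are hypothetical, and wrapping coordinates at the world edge is a choice made here, not something the text specifies.

```c
#include <stdint.h>

#define TILE_SHIFT     8u
#define TILE_SIZE      (1u << TILE_SHIFT)  /* 256 cells per side */
#define TILE_MASK      (TILE_SIZE - 1u)
#define TILES_PER_SIDE 4u                  /* 4×4 tiles = 1024×1024 cells */

typedef uint16_t cell_t;

/* 16 tiles, each stored contiguously. */
static cell_t world[TILES_PER_SIDE * TILES_PER_SIDE][TILE_SIZE * TILE_SIZE];

/* Step 1: select the tile from the high bits of the coordinates.
 * Assumption: coordinates wrap around at the world edge. */
static cell_t *tile_at(unsigned x, unsigned y)
{
    unsigned tx = (x >> TILE_SHIFT) & (TILES_PER_SIDE - 1u);
    unsigned ty = (y >> TILE_SHIFT) & (TILES_PER_SIDE - 1u);
    return world[ty * TILES_PER_SIDE + tx];
}

/* Step 2: index inside the tile with the low 8 bits of each coordinate. */
static cell_t world_fetch(unsigned x, unsigned y)
{
    const cell_t *tile = tile_at(x, y);
    return tile[((y & TILE_MASK) << TILE_SHIFT) | (x & TILE_MASK)];
}

static void world_store(unsigned x, unsigned y, cell_t value)
{
    cell_t *tile = tile_at(x, y);
    tile[((y & TILE_MASK) << TILE_SHIFT) | (x & TILE_MASK)] = value;
}
```

An inner loop that stays inside one tile can call `tile_at` once and reuse the pointer, so the per-cell cost is the same as in the single 256×256 case.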

This removes most of the penalty for larger maps. The renderer stays fast, but access is still direction-dependent.

To address that, a second copy of the map is kept in memory, rotated by 90 degrees. It is generated once at startup. Depending on the viewing direction, the renderer switches between the two layouts.
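Building the second copy could look roughly like this. The sketch uses a single 256×256 tile to stay short and takes the "swap x and y" description literally, so the copy is a transpose (a strict 90-degree rotation would also mirror one axis). The names `map_xy`, `map_yx` and `build_second_copy` are hypothetical.

```c
#include <stdint.h>

#define TILE_SIZE 256u

typedef uint16_t cell_t;

static cell_t map_xy[TILE_SIZE * TILE_SIZE];  /* primary: index = y * 256 + x */
static cell_t map_yx[TILE_SIZE * TILE_SIZE];  /* second copy: index = x * 256 + y */

/* Run once at startup: fill the second copy from the primary one. */
static void build_second_copy(void)
{
    for (unsigned y = 0; y < TILE_SIZE; y++)
        for (unsigned x = 0; x < TILE_SIZE; x++)
            map_yx[x * TILE_SIZE + y] = map_xy[y * TILE_SIZE + x];
}
```

The build itself strides badly through one of the two arrays, but it runs only once, so that cost is paid outside the render loop.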

This avoids the worst-case access pattern entirely. Whichever direction the renderer walks, one of the two copies stores that direction contiguously.
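How the switch is made is not spelled out, so the following is only one plausible sketch: pick the copy whose contiguous axis matches the dominant axis of the inner-loop step. The `layout_t` type, `pick_layout`, `layout_fetch` and the step-vector criterion are all assumptions, and the second copy is again treated as a transpose.

```c
#include <stdint.h>
#include <stdlib.h>

#define TILE_SIZE 256u

typedef uint16_t cell_t;

static cell_t map_xy[TILE_SIZE * TILE_SIZE];  /* index = y * 256 + x */
static cell_t map_yx[TILE_SIZE * TILE_SIZE];  /* index = x * 256 + y */

typedef struct {
    const cell_t *cells;  /* which copy to read from */
    int swap_xy;          /* nonzero: x selects the run, y the position in it */
} layout_t;

/* Choose the copy in which the dominant step direction is contiguous. */
static layout_t pick_layout(int step_x, int step_y)
{
    layout_t l;
    if (abs(step_x) >= abs(step_y)) {
        l.cells = map_xy;
        l.swap_xy = 0;
    } else {
        l.cells = map_yx;
        l.swap_xy = 1;
    }
    return l;
}

/* Fetch cell (x, y) through the chosen layout; x and y must be in 0..255. */
static cell_t layout_fetch(layout_t l, unsigned x, unsigned y)
{
    unsigned major = l.swap_xy ? x : y;  /* selects the 256-cell run */
    unsigned minor = l.swap_xy ? y : x;  /* position inside that run */
    return l.cells[major * TILE_SIZE + minor];
}
```

The choice only needs to be made once per frame or per ray, not per cell, so the branch stays out of the hot loop.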

The cost is additional memory and slightly more work when the map changes. Updates, such as water propagation, have to be written to both copies. In practice this is cheap, since it only requires swapping x and y indices.
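A sketch of such a double write, again for a single 256×256 tile and with hypothetical names (`map_xy`, `map_yx`, `map_store`). Wrapping the coordinates with a mask is an assumption made here for safety.

```c
#include <stdint.h>

#define TILE_SIZE 256u
#define TILE_MASK (TILE_SIZE - 1u)

typedef uint16_t cell_t;

static cell_t map_xy[TILE_SIZE * TILE_SIZE];  /* index = y * 256 + x */
static cell_t map_yx[TILE_SIZE * TILE_SIZE];  /* index = x * 256 + y */

/* Write one cell to both copies so they never drift apart. */
static void map_store(unsigned x, unsigned y, cell_t value)
{
    x &= TILE_MASK;
    y &= TILE_MASK;
    map_xy[y * TILE_SIZE + x] = value;
    map_yx[x * TILE_SIZE + y] = value;  /* same cell, indices swapped */
}
```

Routing every map change through one function like this keeps the two copies consistent without a separate resync pass.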

The result is a layout that matches the access pattern instead of fighting it. The renderer does not try to be cache-friendly by accident. It is designed that way.

Proxied content from gemini://geopeedees.org/~enthusi/blog/voxel_2026-04-13.gmi (external content)