Big renderer improvements: Correct and automatic blending sorting, more powerful batching (now cross-scene), easier and more reliable occlusion culling and occlusion sorting


Screenshots: blending in 2D from "The Unholy Society"; blending in 2D from "Escape from the Universe"; FPS game example with solid wireframe; FPS game example; customized color channels; isometric_game with batching; solid wireframe.

It’s a pleasure to finally close a big TODO from our roadmap. This refactor took 122 commits, and it reworks core rendering code in the engine. It was hard to do but also a lot of fun, and I used the opportunity to simplify a lot of code. These simplifications also translate to some performance gains.

New features / improvements

Blending sorting is now much better:

  • It can be controlled using the simple TCastleViewport.BlendingSort property.

  • The default is sortAuto, which auto-detects and uses the best blending sorting for 2D or 3D (depending on camera parameters). For many applications, you should no longer need to touch this at all; blending should “just work”. This was also tested with our bigger games, The Unholy Society and Escape from the Universe.

  • The new algorithm sorts all shapes, from all scenes. So it can account for transparent objects transformed in any way (using any hierarchy of TCastleTransform) and even for cases when multiple scenes’ shapes may result in mixed order. E.g. some transparent objects on the big (level) TCastleScene may be in front, and some behind, transparent objects of some creature TCastleTransform.

  • To address non-trivial cases we have a new TCastleViewport.OnCustomShapeSort event. You can define a sorting function, using any criteria you want. See TShapeSortEvent for details and an example event implementation.
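To make the idea of viewport-wide sorting concrete, here is a minimal Python sketch. It is not the engine's code and does not use the CGE API: the `Shape` record, the helper names, and the sample positions are all hypothetical. It only illustrates that transparent shapes from different scenes are placed in one list and ordered back to front, either by distance to the camera (3D), by Z (2D, assuming the camera looks along -Z), or by a user-supplied comparison.

```python
from dataclasses import dataclass
from functools import cmp_to_key


@dataclass
class Shape:
    name: str
    scene: str        # which scene the shape belongs to (hypothetical field)
    position: tuple   # world-space (x, y, z), after all transformations


def distance_sq(a, b):
    """Squared distance between two 3D points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def sort_3d(shapes, camera_pos):
    # Back to front: the farthest shape is drawn first.
    return sorted(shapes, key=lambda s: distance_sq(s.position, camera_pos),
                  reverse=True)


def sort_2d(shapes):
    # Assumption: the 2D camera looks along -Z, so lower Z is farther away.
    return sorted(shapes, key=lambda s: s.position[2])


def sort_custom(shapes, compare):
    # compare(a, b) < 0 means "draw a before b".
    return sorted(shapes, key=cmp_to_key(compare))


def by_screen_y(a, b):
    # Hypothetical custom criterion: shapes with higher Y are drawn first.
    return (a.position[1] < b.position[1]) - (a.position[1] > b.position[1])


# Shapes from two different scenes, interleaved in depth.
shapes = [
    Shape("window", "level", (0, 3, -2)),
    Shape("ghost", "creature", (0, 1, -5)),
    Shape("smoke", "level", (0, 2, -9)),
]

order_3d = [s.name for s in sort_3d(shapes, (0, 0, 0))]
order_2d = [s.name for s in sort_2d(shapes)]
order_custom = [s.name for s in sort_custom(shapes, by_screen_y)]
```

Note how the creature's "ghost" ends up between two shapes of the level scene in the 2D and 3D orders. A sort that only works inside each scene cannot produce that order.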

Batching is now more powerful and easier to use:

  • Batching can be activated by TCastleViewport.DynamicBatching, and is thus configurable from the CGE editor too.

    Remember that in the CGE editor you can invoke “Edit -> Show Statistics” (F8) to see the rendering statistics. They will reflect batching: the number of rendered shapes will drop.

  • Batching now works cross-scenes. That is, shapes from one scene can be merged with shapes from another, unrelated scene, if their de-facto rendering settings (material, texture) match.

    More cases are now allowed for batching, too. In particular, batching now works for shapes (even from different scenes) with the same image loaded, and for TCastleImageTransform instances with the same image loaded. These cases, while seemingly trivial, are important when you design 2D maps.

    This makes batching much more universal, because it works in more situations.

    In particular, designing your world using a big number of TCastleImageTransform instances is now reasonable. A grid of 100×100 TCastleImageTransform instances will not cause 100×100 = 10,000 draw calls; they will be batched into as few draw calls as possible (maybe even one, if you just use one image for all tiles). Example: examples/isometric_game.

  • The global variable DynamicBatching is deprecated. Prefer to use TCastleViewport.DynamicBatching.
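The effect of cross-scene batching on the draw call count can be sketched in a few lines of Python. This is a simplified illustration, not the engine's algorithm: the `(scene, material, texture)` tuples and the `batch` helper are hypothetical, and a real renderer has to check more conditions before merging shapes.

```python
from collections import defaultdict


def batch(shapes):
    """Group shapes by rendering state, ignoring which scene they come from.

    Each shape is a hypothetical (scene_name, material, texture) tuple.
    Every resulting group stands for one draw call.
    """
    batches = defaultdict(list)
    for scene_name, material, texture in shapes:
        batches[(material, texture)].append(scene_name)
    return batches


# A 100x100 tile map where every tile is a separate object with the same image.
tiles = [(f"tile_{x}_{y}", "unlit", "grass.png")
         for x in range(100) for y in range(100)]
tile_batches = batch(tiles)

# Two unrelated scenes sharing one image join a second batch.
mixed = tiles + [("tree_a", "unlit", "tree.png"), ("tree_b", "unlit", "tree.png")]
mixed_batches = batch(mixed)

print(len(tiles), "shapes ->", len(tile_batches), "draw call(s)")
print(len(mixed), "shapes ->", len(mixed_batches), "draw call(s)")
```

In this sketch the 10,000 tiles collapse into a single group because they share one material and one image, which mirrors the drop you would expect to see in the rendering statistics.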

Occlusion culling is now easier and more powerful:


  • The new option TCastleRenderOptions.InternalColorChannels limits which RGBA channels are written when rendering a given model. This allows for some cool graphic tricks.

    It was also necessary for tricks that require rendering something only to the depth buffer, without touching the color buffer. See the fixed_camera_game example.

    For now this property is “internal”. I’m unsure how useful it will actually be, and we want to reserve the right to remove it if it becomes too inconvenient to maintain. Admittedly it may work a little weirdly: it doesn’t write some RGBA channels, but it still writes the depth values, so it’s a little hard to control what is visible behind it. If you find it useful anyway, please let me know about it 🙂

  • OpenGL resources that are associated with X3D nodes are now stored in a much more straightforward way, avoiding dictionary lookups and avoiding reinitialization when you move a scene from one viewport to another. This makes accessing texture resources and screen effects’ shaders a simple and instant operation, instead of the previous inefficient search.

Backward compatibility

This is a compatibility-breaking change, no way around it. The old sorting method, which sorted only inside a scene, was inherently incorrect. Trying to maintain it for backward compatibility would make both the code and the API really complicated. So you just have to adjust: set TCastleViewport.BlendingSort if needed.

As always, if you have questions about upgrading, ask us!

What we did to address backward compatibility:

  • From editor: We show a warning on import if your scenes use a non-standard blending sort.

    We also show a warning if you used the old Viewport.Items.BlendingSort.

    Judge whether the new version works, adjust TCastleViewport.BlendingSort if needed, and save the design to remove the warning.

  • From code: RenderOptions.BlendingSort is just removed. Deliberately, there’s no way to sort only within a scene now. We always sort within the viewport.

    Same for Viewport.Items.BlendingSort. Same for TCastleScene.Setup2D.

    The compilation should break, forcing you to upgrade to the new TCastleViewport.BlendingSort. It takes different arguments (sortXxx enums) and works much better. As explained above, you likely don’t need to set it at all. In my experiments, in almost all cases I just removed the calls to the removed methods / properties, and things now work correctly automatically.

  • Default BlendingSort = sortAuto automatically adjusts to your camera.

    Setting up a typical 2D camera, by TCastleViewport.Setup2D or any other method, will activate 2D blending sorting.

    I went through a few iterations of this. I considered Setup2D not doing anything, or having an obligatory AdjustBlendingSort (without default=true), or having a set like [seBlendingSort2D], or making Setup2D just set BlendingSort:=sort2D. In all usages of TCastleViewport.Setup2D, in CGE and in our games, it was obvious that using the new BlendingSort:=sort2D is a good idea.

    Eventually BlendingSort:=sortAuto won. It covers all use cases nicely and easily. It is also naturally the default value of the TCastleViewport.BlendingSort property.
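As a rough illustration of what "adjusting to your camera" could mean, here is a small Python sketch. The detection rule used here (orthographic projection and a view direction along -Z counts as a 2D camera) is an assumption made for the example; the post only says the choice depends on camera parameters. The `Camera` record and `resolve_auto_sort` function are hypothetical and not part of the engine.

```python
from dataclasses import dataclass


@dataclass
class Camera:
    orthographic: bool
    direction: tuple  # normalized view direction (x, y, z)


def resolve_auto_sort(camera):
    """Pick a concrete sorting mode for an "auto" setting.

    Assumed rule, for illustration only: an orthographic camera looking
    along -Z is treated as a typical 2D camera, anything else as 3D.
    """
    looks_along_minus_z = camera.direction == (0.0, 0.0, -1.0)
    if camera.orthographic and looks_along_minus_z:
        return "2D"
    return "3D"


mode_2d_game = resolve_auto_sort(Camera(True, (0.0, 0.0, -1.0)))
mode_fps_game = resolve_auto_sort(Camera(False, (0.0, 0.0, -1.0)))
mode_isometric = resolve_auto_sort(Camera(True, (-0.5, -0.7, -0.5)))
```

The point of such a rule is that the user configures only the camera, and the sorting mode follows from it without a separate setting.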

Special considerations for custom rendering

By “custom rendering” I mean here things that override TCastleTransform.LocalRender to issue direct OpenGL(ES) commands. For example Kagamma’s (Trung Le) projects:

If you perform custom rendering, without using TCastleScene nodes, note that things are a bit different now. TCastleScene.LocalRender only schedules shapes to be rendered later. So your custom rendering will always happen before the CGE TCastleScene rendering, and objects with custom rendering are assumed to be behind for blending. This shouldn’t really matter for rendering without blending, but it may matter for rendering with blending.
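The ordering consequence described above can be shown with a tiny Python model. It is only a sketch of the "schedule now, draw later" idea: the `Scene` and `CustomRenderer` classes and the `render_viewport` function are hypothetical stand-ins, not engine classes.

```python
class Scene:
    """Stand-in for a scene: it only schedules its shapes for later."""

    def __init__(self, shapes):
        self.shapes = shapes

    def local_render(self, draw_log, scheduled):
        scheduled.extend(self.shapes)


class CustomRenderer:
    """Stand-in for a transform that issues draw commands immediately."""

    def __init__(self, name):
        self.name = name

    def local_render(self, draw_log, scheduled):
        draw_log.append(self.name)


def render_viewport(items):
    draw_log, scheduled = [], []
    # Phase 1: traverse everything; scenes collect, custom renderers draw.
    for item in items:
        item.local_render(draw_log, scheduled)
    # Phase 2: the collected shapes are drawn (sorting would happen here).
    draw_log.extend(scheduled)
    return draw_log


items = [
    Scene(["level_glass"]),
    CustomRenderer("custom_particles"),
    Scene(["creature_aura"]),
]
log = render_viewport(items)
print(log)
```

Even though the custom renderer sits between the two scenes in the traversal, its output lands first in the draw log, which is why it behaves as if it were behind all scene shapes when blending.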

I plan to experiment with this more, to make custom renderers more flexible. In the future, everything, including custom rendering, should go through one “collect stuff for rendering” phase, and then all collected things should be executed. Right now, we do more “collect rendering” phases (once for opaque, once for blending), also to avoid breaking too many things for custom rendering algorithms. But this will change, so an optimization on top of the existing work, and an improvement for custom renderers, is coming.

Support our work

This was hard, and fun, work 🙂 Like it? Please support us on Patreon!
