1. Introduction

You can use skinned animation when designing models in Castle Game Engine. This means that your 3D model defines a number of joints (aka "bones") that can be transformed (e.g. to make a "walk" animation) and cause a mesh (aka "skin") to animate accordingly, following the joints.

In most cases, you don’t need to do anything special on the engine side to use it. Simply design models using Blender with Blender’s armature and bones, export it to glTF, and load it in Castle Game Engine as usual. A 3D model in Castle Game Engine is displayed and processed using the TCastleScene component, documented in Viewport with scenes, camera, navigation.

The engine will automatically load the skinned animation from glTF, and display it correctly. We can perform the animation on the GPU, which means it’s efficient. We also have a fallback (for ancient GPUs) non-GPU implementation.

2. Skin node under the hood

Each TCastleScene is actually a graph of X3D nodes (rooted in TCastleSceneCore.RootNode). The nodes control everything — display, animation of the model.

The skinned animation is realized using the TSkinNode, described on this page. It’s connected to the skinWeights0 and skinJoints0 fields of most geometry nodes.

Understanding how it works allows you to design or control the animation of characters yourself, from Pascal code. For example, you can load your skinned animated characters from your custom 3D model formats, you can implement inverse kinematics or do any other direct manipulation of joints from code.

The documentation below accounts for two ways how you can build X3D nodes:

You can just write X3D content in a text file, with extension .x3dv ("classic" syntax) or .x3d (XML syntax). The node definition then looks like Skin { … }. See example below for more.
You can build the X3D nodes graph in Pascal code. In this case, you create an instance of the TSkinNode class. The properties of this class map one-to-one to the Skin node from X3D.

3. Overview

We define a new node called Skin (Pascal: TSkinNode). It allows to design a skinned animation (using skeleton with joints, and shapes with skin weights).

It has the following properties (click on each property for more detailed API docs):

joints

When the joints are transformed (moved, rotated, scaled), the skin (meshes listed in Skin.shapes) is updated accordingly.

Allowed values: list of Transform or HAnimJoint nodes.
inverseBindMatrices

For each joint, an "inverse bind matrix" may be specified, which transforms the mesh into the local space of the joint.

Allowed values: list of 4x4 matrices.
shapes

Shapes whose geometries are affected by the skinned animation, that is: their vertexes move to follow the joints that are associated with them.

Allowed values: list of TAbstractShapeNode nodes.
skeleton

Common root of the joints hierarchy.

Allowed values: one X3DGroupingNode node.

This node defines how a set of joints influence a mesh, thus enabling skinned animation in a way that is simple, efficient and perfectly aligned with glTF. This node directly corresponds to a single glTF skin in a glTF file.

The idea is that you have:

A skeleton, which is a hierarchy of joints (aka "bones"), expressed in X3D as a hierarchy of TTransformNode (or any other X3D node that defines transformation, like THAnimJointNode).
A number of meshes, which are a number of X3D geometry nodes using TAbstractCoordinateNode with per-vertex data in skinWeights0 and skinJoints0.

Each vertex of a mesh is affected by a small subset of joints. For each joint on each vertex, a weight determines how much each joint affects this vertex. You can change joints (change their transformations, e.g. by regular X3D animation using TTimeSensorNode and interpolators, or just directly access the joint node from Pascal and change its translation / rotation / scale) and in effect the mesh should change.

The Skin node defines the connection between the joints and the skin.

Placement of the Skin node within the X3D nodes graph matters. Skin is a descendant of the X3DChildNode (TAbstractChildNode) in Pascal) node. Placing it under a specific parent transformation makes the resulting joints (Skin.joints) and skin (Skin.shapes) be part of the X3D transformation hierarchy, so they are rendered with the designated transformation.

Note	If you’re familiar with the X3D specification syntax for node definitions, you can find `Skin` node spec expressed in this way here.

Note	The `Skin` node itself is not an "animation" that you can play. The actual animations are defined by using `TimeSensor` nodes that cause interpolators to transform (move, rotate, scale) the joint nodes. Animating joints will in turn transform the meshes as indicated by this `Skin` node.

Note	There are some differences, but there are also some similarities between `Skin` node and `HAnimHumanoid` node. They both can be used for skinned animation. If you’re familiar with H-Anim and `HAnimHumanoid` node, we outline the differences (and similarities) in the section below.

4. Additional per-vertex information at geometry nodes

We also enhance geometry nodes with the necessary per-vertex information for each vertex (which joints affect it, how much):

skinJoints0 (from Pascal: TAbstractComposedGeometryNode.SetSkinJoints0): for each vertex, 4 most important joints that affect it.
skinWeights0 (from Pascal: TAbstractComposedGeometryNode.SetSkinWeights0): for each vertex, how much is it affected by the 4 most important joints.

This information is a per-vertex data useful for animation systems, whether implemented by CPU or GPU.

Note	In the future we may introduce fields like `skinJoints1`, `skinWeights1` etc. to support more than 4 joints per vertex. This is consistent with glTF data `JOINTS_0`, `WEIGHTS_0`, `JOINTS_1`, `WEIGHTS_1` etc.

5. X3D Example

Example Skin definition is as follows:

Note	Fully working version of this example is in demo-models, animation/skinned_animation.x3dv. You can open it using any engine tool, e.g. Castle Model Viewer.

Skin {
    # Shapes that are affected by the skinned animation.
    # They are displayed as part of displaying the Skin node.
    # Note that this list can only contain Shape nodes, not more
    # complicated compositions like transformations of them.
    shapes [
        DEF SkinnedMeshShape1 Shape {
            geometry IndexedFaceSet {
                ...
                skinWeights0 ...
                skinJoints0  ...
            }
        }
        DEF SkinnedMeshShape2 Shape {
            geometry IndexedFaceSet {
                ...
                skinWeights0 ...
                skinJoints0  ...
            }
        }
    ]

    # Joints hierarchy, starting from root node.
    #
    # Note: The hierarchy below is *just a trivial example*,
    # not a proper example of how joints for a typical humanoid
    # look like. Follow H-Anim conventions for joints to define
    # typical humanoid joints hierarchy.
    #
    # Any X3D node graph is allowed here, in particular you can
    # also specify Shape nodes within the transformations here.
    # Such Shape nodes will be just rigid 3D shapes attached to
    # the joints (like a sword may be attached to the avatar's hand).
    #
    # The Shape(s) that should be affected by the skin mechanism
    # (modified by joints) should *not* be listed here, they should
    # be placed only in the Skin.shapes field.
    skeleton DEF RootJoint Transform {
        children [
            DEF Body Transform {
                children [
                    DEF LeftThigh Transform {
                        children [
                            DEF LeftShin Transform {
                                children [
                                    DEF LeftFoot Transform {
                                    }
                                ]
                            }
                        ]
                    }
                    DEF RightThigh Transform {
                        children [
                            DEF RightShin Transform {
                                children [
                                    DEF RightFoot Transform {
                                    }
                                ]
                            }
                        ]
                    }
                ]
            }
        ]
    }

    # List of joints.
    # This is a list of all the joints that control the shapes
    # in the "shapes" list.
    #
    # If a human writes the X3D content,
    # we suggest placing all the joints in "skeleton" hierarchy,
    # with names (DEF). Then in the "joints" field just list them with "USE".
    # This is easier to read and write for humans.
    #
    # That said, following X3D, it is possible to write using any order,
    # as long as each USE is after a corresponding DEF.
    # So you can also start by writing
    # joints in "joints" list, only interlinking to previous joints by USE,
    # and then define the "skeleton" hierarchy by a USE to the root joint.
    # For software, this is equally easy to write and later read, so not a problem.
    # For humans, it will likely look more complicated.
    #
    # The order of joints here matters:
    # joint indexes specified in the skinJoints0 array
    # refer to the position of joint on this list.
    #
    # All the joints listed here must also be part of the "skeleton"
    # hierarchy.
    # However, not *all* transformation nodes (Transform) from the "skeleton"
    # need to be listed here and considered "joints". You only need to list
    # the joints that actually affect some vertex in some some shape,
    # to be able to refer to this joint from skinJoints0 array.
    # If a joint is merely a parent for other joints but doesn't *directly*
    # influence any vertex, there's no need to list it here.
    joints [
        USE RootJoint
        USE Body
        USE LeftThigh
        USE LeftShin
        USE LeftFoot
        USE RightThigh
        USE RightShin
        USE RightFoot
    ]
}

6. Pascal example (modify joints by code)

Build and run example examples/animations/animate_bones_by_code to see how to modify joints by Pascal code.

7. More testcases

Simply open any glTF file with a skinned animation in Castle Model Viewer and save it back to X3D. This is a way to have lots of testcases of the Skin node.

Links to many resources with animated glTF models are in our assets page, e.g. look at Quaternius and Sketchfab glTF models.

8. Why?

Our approach is deliberately closely aligned with glTF.

Transforming glTF skinned animation information into X3D Skin is straightforward.
Implementing Skin, including implementing it on GPU (skin is applied in shaders) is as straightforward as in glTF. You can follow the glTF cheat sheet that describes the GPU implementation on skinning in 1.5 pages.

8.1. Difference from H-Anim

8.1.1. Different scope

First of all, the scope of this node is different (and deliberately much smaller) than what H-Anim spec offers.

H-Anim defines various ways to animate humanoids and various conventions how to design humanoid joints (with different levels of articulation).
We consider, in this node, only a single animation technique: skinned animation.

In this node, we’re keeping it agnostic from whether you apply it to humanoids or non-humanoids, like non-humanoid animals or imaginary alien creatures, plants, rubbery machines etc.

If you wonder how to define joints for a humanoid, just follow H-Anim LOA conventions for joints naming and organization. In fact you can use HAnimJoint nodes within our Skin node, we deliberately made it possible (though you can also just use Transform nodes instead of HAnimJoint nodes).

The Skin node simply provides an alternative way to specify skinned animation. We don’t try to fill other use-cases of H-Anim.

8.1.2. Different way to specify skinned animation

X3D standard is integrated with the H-Anim standard, and thus already has a way to perform skinned animation. So why do we invent an alternative way?

Because we think that skinned animation can be expressed a bit simpler than the H-Anim does. We think glTF approach is a good way to do it.

H-Anim is a big specification, that is linked from another big specification (X3D). We rather like the simplicity and efficiency of how skinned animation is defined in glTF. The way skinned animation is expressed in glTF spec and in glTF cheatsheet seems simpler to us.
One reason for this simplicity is that our approach doesn’t invent any new concept for "joints". Our joints are just Transform nodes. We don’t need HAnimJoint.
We add extra flexibility, just like glTF has: you are not limited to one skinCoord in one HAnimHumanoid. One Skin node can influence multiple geometries with completely different Coordinate nodes, thanks to having skinWeights0 and skinJoints0 on each geometry node.
Our Skin, just like glTF approach, has an obvious, simple and efficient GPU implementation. The glTF cheat sheet describes it in 1.5 pages and it’s really easy to implement.
Our design also makes some assumptions, that map to practical usage in our experience, and enable efficient GPU implementation.

Namely, per-vertex weights and joints are stored as 4D vectors, so each vertex has a list of "4 most important joints". See TAbstractComposedGeometryNode.SetSkinWeights0 and TAbstractComposedGeometryNode.SetSkinJoints0 for details. This is enough in practice, in our experience, for even quite complicated gamedev models.

In the future we will likely allow more than 4 joints per vertex, by introducing skinWeights1 and skinJoints1 (and maybe more). Some limit on the number of joints-influencing-a-vertex will in practice always be present, to enable efficient GPU skinning of meshes. That’s also the reason why current fields skinWeights0 and skinJoints0 have 0 at the end.

We realize that this design "uncovers" an implementation detail (it’s efficient to process things, on GPU and CPU, as 4D vectors). But at the same time, experience shows that it’s not troublesome, and even with a limit of "4 joints per vertex", it’s enough in practice for lots of 3D animations to look good.
We want naming that clearly says it’s an animation technique that works with any mesh, humanoid or not.

H-Anim naming implies it’s for humanoids, parts of H-Anim spec talk about joints for humanoids — this makes using HAnimHumanoid for animating arbitrary meshes confusing for the developers. For this reason, we also wanted to have joints/bones simply defined as Transform nodes. This again echoes our desire to keep it simple (standard X3D Transform is a joint), and also we don’t want to connect this to "humanoids", even by terminology. Skinned animation is a standard 2D and 3D animation technique, not specific to humanoids at all.

8.1.3. Similarities to H-Anim (HAnimHumanoid)

There are numerous deliberate similarities between Skin and HAnimHumanoid.

They both have joints list and skeleton fields, with practically the same purpose and meaning.
In both of them, you can use HAnimJoint nodes for joints. In Skin, you can also use simple Transform nodes for joints, but if you want to support both systems in a piece of code generating X3D, you can just use HAnimJoint nodes and later decide do you put them in Skin or HAnimHumanoid.
The placement of Skin and HAnimHumanoid in the transformation matters. This is actually something where we follow X3D and not glTF. It makes sense, for humans and for software, that Skin is part of the transformation hierarchy.

9. How do collisions work when skin is calculated on GPU

9.1. You need to assign proper bounding box explicitly

Since the mesh (skin) is only updated on GPU, the algorithms on CPU don’t know about the updated skin vertexes.

Note	Side note: There are ways to transfer data calculated on GPU back to CPU. Like transform feedback, we have a demo using it in our engine. But we don’t want to apply them here — not only it would complicate things, but also would cost some efficiency.

The solution is to manually define the bounding box for the shapes affected by skin. This is done by setting bboxCenter and bboxSize fields of the shapes. From Pascal code, get/set TAbstractShapeNode.BBox. Manually set them to include all possible skin arrangements in all possible animation frames.

If you don’t do this, the bounding box auto-calculated by the engine will be based on the initial (non-animated) object pose, and it may result in errors: frustum culling and ray picking will consider this box, and e.g. frustum culling may decide to cull (not render) the object when it should actually be visible.

Note

As an alternative to setting proper bounding box using TAbstractShapeNode.BBox (which we recommend) you could also turn off frustum culling by changing both TCastleScene.SceneFrustumCulling and TCastleScene.ShapeFrustumCulling to false (but we really don’t recommend this; it may cost significant performance, unless you only have a few such scenes in your world).

Note	This limitation is not specific to our engine. E.g. three.js also has this, see here and here. In general, calculating a bounding box for skinned-animated objects (whether it is done on CPU or GPU), or updating any mesh collider, is a cost that you usually want to avoid.

9.2. Collisions that detect which body part is hit, using physics

To detect collisions using physics, and detect e.g. whether a ray hits the head or leg of a skinned humanoid, you should attach physical colliders to joints. To do this:

Expose a subset of joints using TCastleSceneCore.ExposeTransforms. See Expose scene elements documentation. This will give you a hierarchy of TCastleTransform that you can attach colliders to.
Attach colliders, like simple TCastleBoxCollider to the exposed transformations.
Make sure to mark that the collider is going to move because of animation (and should not move because of physics forces like gravity):
- Set TCastleRigidBody.Dynamic to false
- Set TCastleRigidBody.Animated to true

10. TODO

Prepare a demo showing the "Collisions that detect which body part is hit, using physics" to detect headshots on an animated model.

Also, switch to ragdoll physics (by adding constraints, and toggling TCastleRigidBody.Dynamic) when character is killed.
Right now, shadow volumes disable calculating skinned animation on GPU.
- This is necessary. To calculate efficient shadow quads, we need to calculate "silhouette edges", which in turn means we need to know all vertexes final coordinates (after skin) on CPU. And we need to have proper shadow quads in every frame (corresponding to the shadow caster rendering in this frame, not some old version), as shadow volumes assume that real mesh provides a "cap" to the geometry rendered into the stencil.
- So for now, when a shape is a shadow caster for shadow volumes, it doesn’t use skinned animation on GPU. Skinned animation is done on CPU.
- Yes, in effect this makes shadow volumes affect performance in a negative way — not only the shadow volumes itself cost (rendering shadow quads, 2 passes), but also it disables skinned animation on GPU.
- To overcome this we’d need to use geometry shaders to render shadow quads, which is quite some additional work.
- Practical solution will likely be to recommend shadow maps once we improve them.

Skinned animation using the Skin node