Descriptor Sets and Pain

GPU work usually begins with vertices and with providing that data to the GPU. In the previous post we discussed vertex buffers, their API design in Metal and Vulkan, and how they can be made better if we accept not supporting every potential vertex buffer layout.
GPU work also usually requires data that is not per-vertex but instead applies to the entire model or set of vertices. We refer to such data as GPU resources. They can be roughly divided into a few types:

Uniform/Constant data: This is data that is read-only on the GPU and can be provided to each stage of the graphics pipeline. An example would be the projection matrix for the camera.
Textures: These are typically read-only image data and are usually sampled when coloring/texturing a model.
- We will consider render targets that are written by a previous GPU workload as just being a texture.
Storage Buffers/Textures: These are buffers and textures that are read-write rather than read-only. They can typically be much larger than uniform buffers, whose maximum size is limited by the GPU.
Samplers: These are GPU objects that are used together with textures to define how a texture should be sampled.

Each GPU graphics/compute stage can have access to any of the above types. Providing those resources usually has two parts: the graphics API and the syntax in the shading language.

Graphics API for GPU Resources

Each API takes a different approach when providing the data. It's tightly coupled with the shading language that is commonly used with that graphics API. For Vulkan this would typically be GLSL, for Metal it would be MSL. While it's possible to use HLSL with Vulkan, I won't explore this as GLSL has more features when it comes to Vulkan.

In general the main pieces of information we need to encode for a resource are these:

1. The type of the resource, i.e. Uniform, Texture, Storage Buffer.
2. The stages where the resource is used, i.e. which stages of the pipeline can access this specific resource.
3. The location/binding point specified in the shader language, i.e. how the shader specifies the resource.
4. The encoding used by the API, i.e. what structs/data you need to specify in the client code.

Vulkan

In Vulkan, to specify a set of resources to the GPU you need the following structs, listed in logical order:

1. VkDescriptorSetLayoutCreateInfo
2. VkDescriptorSetLayoutBinding
3. VkDescriptorPoolSize
4. VkDescriptorPoolCreateInfo
5. VkDescriptorSetAllocateInfo
6. VkWriteDescriptorSet
7. VkDescriptorImageInfo
8. VkDescriptorBufferInfo
9. VkPipelineLayoutCreateInfo

1 and 2 are used to specify what is essentially a Descriptor Set's layout, which must match what the shader declares.
3, 4, and 5 are used to allocate GPU memory and to allocate a Descriptor Set.
6, 7, and 8 are used to write our resource data into the Descriptor Set.
Finally, 9 is used to communicate to the pipeline object the layout of the descriptors that the shader expects.

There are other structs that come into play but at the bare minimum these are required.

GLSL Specification of resource types

In GLSL resources are specified by using a few keywords:

uniform: This is usually used to specify that a piece of data is read-only.
texture, sampler, sampler2D: These are used to specify texture types. Sampled textures are read-only in the shader; read/write access goes through the separate image types such as image2D. In Vulkan, samplers can be kept separate, but they are usually combined with their texture into one type. For a 2D texture the combined type is called sampler2D.
buffer: This is used to specify a storage buffer.

If we assume a non-bindless structure, i.e. we use binding slots for our resources, then GLSL provides a layout qualifier to specify the bind slots:

layout(binding = <n>, set = <n>)

binding specifies the bind slot, while set acts as a namespace for the binding slots. This set is reflected in the Vulkan API.

A Vulkan example

The best way to understand all of the above for Vulkan is to walk through an example. We will have two sets, each with only one bind slot. The first set will have a buffer while the second set will have a texture.

VkDescriptorSetLayoutBinding buffer_bind = {
    .binding = 0,                                      // bind slot
    .descriptorType = <uniform_buffer>,                // the type
    .descriptorCount = 1,                              // not covered
    .stageFlags = <which stages can use this binding>, // stages
};

VkDescriptorSetLayoutCreateInfo set_one_layout = {
    ..
    .bindingCount = 1,
    .pBindings = &buffer_bind,
};

VkDescriptorSetLayoutBinding texture_bind = {
    .binding = 0,                                      // bind slot
    .descriptorType = <texture>,                       // the type
    .descriptorCount = 1,                              // not covered
    .stageFlags = <which stages can use this binding>, // stages
};

VkDescriptorSetLayoutCreateInfo set_two_layout = {
    ..
    .bindingCount = 1,
    .pBindings = &texture_bind,
};

// Pool for the descriptors
VkDescriptorPoolSize pool_sizes[2] = {
    // buffer type
    {
        .type = <uniform_buffer>,
        .descriptorCount = 1
    },
    
    // texture type
    {
        .type = <texture>, 
        .descriptorCount = 1
    }
};

VkDescriptorPoolCreateInfo pool_create_info = {
    ..
    .maxSets = 2, // how many sets can be created from this pool
    .poolSizeCount = 2,
    .pPoolSizes = pool_sizes,
};

VkDescriptorSetAllocateInfo set_allocation = {
    ..
    .descriptorPool = <pool_created>,
    .descriptorSetCount = 2,
    .pSetLayouts = <array_of_set_layouts>,
};

VkWriteDescriptorSet writes[2] = {
    // buffer write to set one
    {
        ...
        .dstSet = <set_one>,
        .dstBinding = 0, 
        .descriptorCount = 1,
        .descriptorType = <uniform_buffer>,
        .pBufferInfo = &(VkDescriptorBufferInfo) {
            .buffer = <uniform_buffer_object>,
            .offset = <offset_inside_buffer>,
            .range = <size_of_access>
        }
    },
    
    // texture write to set two
    {
        ...
        .dstSet = <set_two>,
        .dstBinding = 0, 
        .descriptorCount = 1,
        .descriptorType = <texture>,
        .pImageInfo = &(VkDescriptorImageInfo) {
            .imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
            .imageView = <texture_view>,
            .sampler   = <sampler_for_the_texture> // only needed if it's a combined image with a sampler
        }
    },
};

VkPipelineLayoutCreateInfo pipeline_layout_info = {
    ..
    .setLayoutCount = 2,
    .pSetLayouts = <array_of_set_layouts>,
};

As can be observed, this is quite verbose, as is usual for Vulkan. The pool objects exist because GPU-side memory must be allocated, and this gives vendors a way to do it and to surface this information. In an ideal world, the pool should be a few mallocs but this world doesn't exist.

Most of this is self-explanatory, with one gotcha: which layout is set 0 and which is set 1. These values are taken from the order in which the layouts are passed to VkPipelineLayoutCreateInfo, i.e. the first set layout (set one) becomes set 0 and the second set layout (set two) becomes set 1.

Overall, the way this is done is quite atrocious and very verbose.

Points 1 and 2 are specified by the descriptorType and stageFlags fields of VkDescriptorSetLayoutBinding.
Point 3 is specified by layout(binding = <n>, set = <n>) in the shader.
Point 4 is specified by the pSetLayouts field of VkPipelineLayoutCreateInfo.

Metal

In Metal, because the hardware is fully controlled by Apple, the API is much simpler. MSL also lacks the idea of sets, so it becomes even easier.

For Metal, once you have created your MTLTexture and MTLBuffer objects, you can just call setVertexBuffer, setFragmentBuffer, setVertexTexture, and setFragmentTexture to bind the buffers and textures.

Specify Resources in MSL

In MSL you can use the [[texture(<n>)]] and [[buffer(<n>)]] attributes to specify the slots. There are no sets, so there are no namespaces for these slots. There are other options to make life easier, such as using ArgumentBuffers. However, forcing the client to use an ArgumentBuffer isn't very friendly, and having the ability to specify resources both in slots and in ArgumentBuffers is much more preferable.

A Metal example is not needed in this case because it's so straightforward.

Point 1 is specified by the type of the resource used, MTLTexture or MTLBuffer.
Point 2 is split into the set<stage>Texture and set<stage>Buffer functions.
Point 3 is specified by the [[texture(N)]] and [[buffer(N)]] attribute specifiers on the shader side.
Point 4 is not needed for the textures but is specified via the fragmentBuffers and vertexBuffers fields.

A Saner approach

A much better API can be teased out here to unify these two. Fortunately it means less Vulkan code, and unfortunately it means a bit more info for the Metal part. I propose that we can get away with far fewer structs, although some of the memory allocation may end up sub-optimal.

I will introduce the concept of a Bind Group. A bind group is essentially a set: it contains a set of resources.

This is what the sh_bind_group_t struct looks like:

typedef struct sh_uniform_t {
    sh_str name;              // For debugging
    sh_uniform_type_t type;   // The type of the resource
    sh_stage_e stages;        // Stages specification

    union {                   // Just a pointer to the types we currently work with
        sh_buffer_t *buffer;
        sh_texture_t *texture;
        sh_render_target_t *render_target;
    };
    sh_sampler_t *sampler;   // Sampler for the texture

    // some fields here that are not as important for now
    ...
} sh_uniform_t;

typedef struct sh_uniform_array_t {
    sh_uniform_t *data;
    u32 count;
} sh_uniform_array_t;

typedef struct sh_bind_group_t {
    sh_str name;                     // Debug name
    sh_uniform_array_t binds;        // the resources array
    u32 *dynamic_offsets;            // Ignore this
    sh_platform_bind_group_t handle; // Platform pimpl
} sh_bind_group_t;

The sh_uniform_array_t is purely to make life easier with a macro.
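The definition of sh_array_input is not shown in this post, so the following is only a guess at how such a macro could work, written with made-up demo_ names and a cut-down uniform struct. The idea is that the macro receives an array compound literal and expands to an initializer holding both the pointer and the element count, so the count never has to be typed by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

/* Cut-down stand-in for sh_uniform_t; only the fields needed here. */
typedef struct demo_uniform_t {
    const char *name;
    int type;
} demo_uniform_t;

typedef struct demo_uniform_array_t {
    demo_uniform_t *data;
    u32 count;
} demo_uniform_array_t;

typedef struct demo_bind_group_t {
    const char *name;
    demo_uniform_array_t binds;
} demo_bind_group_t;

/* Hypothetical macro: expands to a brace-enclosed initializer for
   demo_uniform_array_t. It is variadic so that the commas inside the
   compound literal do not split the argument. sizeof does not evaluate
   its operand, so only one array object is actually created. */
#define demo_array_input(...)                                           \
    {                                                                   \
        .data  = (__VA_ARGS__),                                         \
        .count = (u32)(sizeof(__VA_ARGS__) / sizeof((__VA_ARGS__)[0])), \
    }

/* Counts the entries that were actually filled in (non-NULL name). */
static u32 demo_count_named(const demo_uniform_array_t *binds) {
    assert(binds != NULL);
    u32 named = 0;
    for (u32 i = 0; i < binds->count; ++i) {
        if (binds->data[i].name != NULL) {
            named += 1;
        }
    }
    return named;
}

/* Usage: slot 1 is left out on purpose, so it is zero-initialized. */
static u32 demo_usage(void) {
    demo_bind_group_t group = {
        .name = "camera_info",
        .binds = demo_array_input((demo_uniform_t[]) {
            [0] = { .name = "cam",   .type = 1 },
            [2] = { .name = "light", .type = 1 },
        }),
    };
    return demo_count_named(&group.binds);
}
```

Note that a compound literal created inside a function only lives as long as the enclosing block, so a bind group built this way should be consumed (or copied) before that block ends.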

This is what a typical bind group looks like:

sh_bind_group_t camera_info = {
    .name = "camera_info",
    .binds = sh_array_input((sh_uniform_t[]) {
        [0] = { .name = "cam", .type = SH_UNIFORM_BUFFER, .stages = SH_STAGE_VERTEX, .buffer = &camera_buffer},
    })
};
sh_bind_group_create(&ctx, &camera_info);

When specifying this to the pipeline object, we only need this:

typedef struct sh_bind_group_info_t {
    sh_bind_group_t *bind_group;
    b8 no_bind;
} sh_bind_group_info_t;

sh_pipeline_t pipeline = {
    ...
    .bind_groups = sh_array_input((sh_bind_group_info_t[]){
        [0] = { &camera_info },
    })
};

The sh_bind_group_info_t struct exists for two reasons: (A) it makes life a bit easier with the macro, and (B) I wanted bind groups to be bound automatically when you bind a pipeline, while still being able to override this default behavior via the no_bind flag. It's possible to remove this struct and introduce a sh_pipeline_bind_with_bind_groups function for this purpose, which might be more ergonomic and need less typing. My thinking is that when you bind a pipeline you will also want to bind its bind groups, so doing it automatically saves writing sh_bind_group_bind for every bind group. However, an argument can be made that this should be an explicit choice.
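To make the automatic binding concrete, here is a minimal sketch of how a pipeline bind could walk its bind groups and honor the no_bind flag. All demo_ names are invented for illustration, the structs are stand-ins rather than the real sh_ types, and using the array index as the set number is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

/* Stand-in for sh_bind_group_t; it records binds so the effect is observable. */
typedef struct demo_bind_group_t {
    const char *name;
    u32 times_bound;
    u32 last_set;
} demo_bind_group_t;

typedef struct demo_bind_group_info_t {
    demo_bind_group_t *bind_group;
    bool no_bind; /* opt out of automatic binding */
} demo_bind_group_info_t;

typedef struct demo_pipeline_t {
    demo_bind_group_info_t *bind_groups;
    u32 bind_group_count;
} demo_pipeline_t;

/* Explicit bind. A real backend would issue graphics API calls here. */
static void demo_bind_group_bind(demo_bind_group_t *group, u32 set_index) {
    assert(group != NULL);
    group->times_bound += 1;
    group->last_set = set_index;
}

/* Binds every bind group of the pipeline that is not flagged no_bind.
   Returns how many groups were bound automatically. */
static u32 demo_pipeline_bind(const demo_pipeline_t *pipeline) {
    assert(pipeline != NULL);
    u32 bound = 0;
    for (u32 i = 0; i < pipeline->bind_group_count; ++i) {
        const demo_bind_group_info_t *info = &pipeline->bind_groups[i];
        if (info->bind_group == NULL || info->no_bind) {
            continue; /* empty entry, or the caller wants to bind it manually */
        }
        demo_bind_group_bind(info->bind_group, i); /* assumed: index == set number */
        bound += 1;
    }
    return bound;
}
```

A group flagged with no_bind is simply skipped, and the caller is expected to call the explicit bind function for it at the right moment.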

Points 1 and 2 are handled by the type and stages fields of the sh_uniform_t struct.
Point 3 is handled by the shaders.
Point 4 is handled by the bind_groups field of the sh_pipeline_t struct.

The bind slots are essentially the indices of the items inside the .binds field of the sh_bind_group_t, where each array entry corresponds to a bind slot. An empty location (where type and stages are invalid) just means skip this slot.
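A small sketch of that rule, assuming a zero value marks an invalid type and an empty stage mask (the demo_ enum and struct are hypothetical stand-ins, not the real sh_ definitions): walk the array, treat the index as the bind slot, and skip entries that were never filled in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

typedef enum demo_uniform_type_e {
    DEMO_UNIFORM_INVALID = 0, /* zero-initialized (skipped) entries land here */
    DEMO_UNIFORM_BUFFER,
    DEMO_UNIFORM_TEXTURE,
} demo_uniform_type_e;

typedef struct demo_uniform_t {
    demo_uniform_type_e type;
    u32 stages; /* bitmask of stages; 0 means no stage */
} demo_uniform_t;

/* Writes the indices of the used bind slots into out_slots and returns
   how many were written. The array index is the bind slot. */
static u32 demo_collect_used_slots(const demo_uniform_t *binds, u32 count,
                                   u32 *out_slots, u32 capacity) {
    assert(binds != NULL && out_slots != NULL);
    u32 used = 0;
    for (u32 slot = 0; slot < count && used < capacity; ++slot) {
        /* Assumption: an entry with no valid type or no stages is empty. */
        if (binds[slot].type == DEMO_UNIFORM_INVALID || binds[slot].stages == 0) {
            continue;
        }
        out_slots[used++] = slot;
    }
    return used;
}
```

With designated initializers such as [0] and [2], slot 1 is zero-initialized and therefore skipped without any extra bookkeeping.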

Note: On the Vulkan side, for every bind group I create I also create a separate pool and allocate exactly one set from it. This is highly suboptimal. On the flip side, on Metal the creation is essentially a no-op beyond setting up a few integers.
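With one pool per bind group, the pool sizes can be derived directly from the group's binds. The sketch below is an assumption about how that tally might look, not the actual implementation: demo_pool_size_t is a stand-in that mirrors the type and descriptorCount fields of VkDescriptorPoolSize, and the demo_ enum values are placeholders for the real descriptor types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

enum {
    DEMO_DESC_INVALID = 0, /* empty bind slot */
    DEMO_DESC_UNIFORM_BUFFER,
    DEMO_DESC_STORAGE_BUFFER,
    DEMO_DESC_TEXTURE,
    DEMO_DESC_TYPE_COUNT
};

/* Stand-in mirroring the two fields of VkDescriptorPoolSize. */
typedef struct demo_pool_size_t {
    u32 type;
    u32 descriptor_count;
} demo_pool_size_t;

/* Tallies how many descriptors of each type one bind group needs.
   out must have room for DEMO_DESC_TYPE_COUNT entries.
   Returns the number of entries written. */
static u32 demo_pool_sizes_for_group(const u32 *bind_types, u32 bind_count,
                                     demo_pool_size_t *out) {
    assert(bind_types != NULL && out != NULL);
    u32 tally[DEMO_DESC_TYPE_COUNT] = { 0 };
    for (u32 i = 0; i < bind_count; ++i) {
        u32 type = bind_types[i];
        if (type != DEMO_DESC_INVALID && type < DEMO_DESC_TYPE_COUNT) {
            tally[type] += 1;
        }
    }
    u32 written = 0;
    for (u32 type = 1; type < DEMO_DESC_TYPE_COUNT; ++type) {
        if (tally[type] > 0) {
            out[written].type = type;
            out[written].descriptor_count = tally[type];
            written += 1;
        }
    }
    return written;
}
```

The resulting array would then feed the pool creation, with the maximum number of sets fixed at one for that pool.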

Flatten Sets

Unfortunately the idea of sets is unique to Vulkan, which means that in other shading languages and on other platforms we need to fake it. My approach here is as follows:

When creating the pipeline, I go through every bind group and number the slots sequentially, i.e. if I have two textures in set 0 and a texture in set 1, then the two textures in set 0 get slot 0 and slot 1 while the texture in set 1 gets slot 2.
To take into account the stages and slots in the shader stages, i.e. where a texture can be slot 1 in the vertex shader while it's slot 0 in the fragment shader, I create a multi-dimensional array.
This multi-dimensional array is 3D, and there is one per bind group: one axis is the stage, one axis is the resource type, and the final axis is the slot index.
When a pipeline is bound we just traverse this multi-dimensional array to bind the slots.
When binding a specific bind group, we must pass in the set number, which we use to look up the slots we calculated previously and bind accordingly.
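The steps above can be sketched roughly as follows. This is one interpretation of the description, with invented demo_ names, only two stages and two resource types, and a fixed maximum number of binds per group. Each group gets a [stage][type][bind index] table, and a running counter per stage and type hands out flat slots in group order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;
typedef int32_t i32;

enum { DEMO_STAGE_VERTEX, DEMO_STAGE_FRAGMENT, DEMO_STAGE_COUNT };
enum { DEMO_TYPE_INVALID, DEMO_TYPE_BUFFER, DEMO_TYPE_TEXTURE, DEMO_TYPE_COUNT };

#define DEMO_MAX_BINDS 8
#define DEMO_UNUSED (-1)

typedef struct demo_bind_t {
    u32 type;       /* DEMO_TYPE_*; DEMO_TYPE_INVALID marks an empty slot */
    u32 stage_mask; /* bit (1u << DEMO_STAGE_*) for every stage that uses it */
} demo_bind_t;

typedef struct demo_group_t {
    demo_bind_t binds[DEMO_MAX_BINDS];
    u32 count;
    /* [stage][type][bind index] -> flat slot, or DEMO_UNUSED */
    i32 flat[DEMO_STAGE_COUNT][DEMO_TYPE_COUNT][DEMO_MAX_BINDS];
} demo_group_t;

/* Numbers the slots sequentially across all groups, in pipeline order.
   The counters are kept per stage and per resource type, so the same
   resource can end up in different slots in different stages. */
static void demo_flatten(demo_group_t *groups, u32 group_count) {
    assert(groups != NULL);
    u32 next[DEMO_STAGE_COUNT][DEMO_TYPE_COUNT] = { { 0 } };
    for (u32 g = 0; g < group_count; ++g) {
        demo_group_t *group = &groups[g];
        assert(group->count <= DEMO_MAX_BINDS);
        for (u32 s = 0; s < DEMO_STAGE_COUNT; ++s)
            for (u32 t = 0; t < DEMO_TYPE_COUNT; ++t)
                for (u32 b = 0; b < DEMO_MAX_BINDS; ++b)
                    group->flat[s][t][b] = DEMO_UNUSED;
        for (u32 b = 0; b < group->count; ++b) {
            const demo_bind_t *bind = &group->binds[b];
            if (bind->type == DEMO_TYPE_INVALID || bind->type >= DEMO_TYPE_COUNT) {
                continue; /* empty slot */
            }
            for (u32 s = 0; s < DEMO_STAGE_COUNT; ++s) {
                if (bind->stage_mask & (1u << s)) {
                    group->flat[s][bind->type][b] = (i32)next[s][bind->type]++;
                }
            }
        }
    }
}
```

Binding a group then reduces to walking its own table and issuing one per-stage bind call for every entry that is not DEMO_UNUSED.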

Conclusion

Creating a better API is hard. I'm not 100% certain that this is the best solution, but so far it is the best one I have come up with. There are many fields and options to consider, and these are part of the sh_uniform_t struct.
This API gives us a nice way to specify sets in Vulkan and, if need be, to build argument buffers on Metal, which are just normal buffers with other resource pointers inside them.