How to write a Vulkan driver in 2022

Faith Ekstrand
March 23, 2022

An incredible amount has changed in Mesa and in the Vulkan ecosystem since we wrote the first Vulkan driver in Mesa for Intel hardware back in 2015. Not only has Vulkan grown, but Mesa has as well, and we've built up quite a suite of utilities and helpers for making writing Vulkan drivers easier. This blog post will be a tutorial of sorts (we won't have a functioning Vulkan driver in the end, sorry), showing off a bunch of those helpers and demonstrating the latest Mesa best practices for Vulkan drivers.

Writing a Vulkan driver in 2015

Vulkan in Mesa started with this git commit from Kristian Kristensen, who was at Intel at the time:

commit 769785c497aaa60c629e0299e3ebfff53a8e393e
Author: Kristian Høgsberg <krh@bitplanet.net>
Date:   Fri May 8 22:32:37 2015 -0700

    Add vulkan driver for BDW

Kristian, Chad Versace, and I had just pivoted from a different prototype project to working on Vulkan. Kristian started us off with a very skeletal start to a driver with lots of hard-coded values, just barely capable of drawing a triangle. We continued developing the driver internally (Vulkan was still under an NDA at the time) until we were finally able to go public on February 16 of 2016, when the Vulkan spec was released, and the NDA lifted.

At the time we were developing ANV (the Intel Vulkan driver), the Vulkan spec itself was still under development and everything was constantly in flux. There were no best practices; there were barely even tools. Everyone working on Vulkan was making it up as they went because it was a totally new API. Most of the code we wrote was purpose-built for the Intel driver because there were no other Mesa drivers to share code with. (Except for the short-lived LunarG Intel driver based on ilo, which we were replacing.) If we had tried to build abstractions, they could have gotten shot to pieces at any moment by a spec change. (We rewrote the descriptor set layout code from scratch at least five or six times before the driver ever shipped.) It was frustrating, exhausting, and a whole lot of fun.

These days, however, the Vulkan spec has been stable and shipping for six years, the tooling and testing situation is pretty solid, and there are six Vulkan drivers in the Mesa tree with more on the way. We've also built up a lot of common infrastructure. This is important both because it makes writing a Vulkan driver easier and because it lets us fix certain classes of annoying bugs in a common place instead of everyone copying and pasting those bugs. So, without further ado, let's get down to it!

Directory structure

First off, every driver needs a name. We're not actually writing one here but it'll make the examples easier if we pretend we are. Just for the sake of example, I'm going to pick on NVIDIA because... Why not? Such a driver is clearly missing and really should happen soon. (Hint! Hint!) We're going to call this hypothetical new Vulkan driver NVK. It's short and obvious. If you don't like me picking on NVIDIA, just pretend it stands for "New VulKan" or nouvulkan, if you prefer.

The first thing we need for this driver is a folder to put it in. Most Vulkan drivers live in src/<vendor>/vulkan. A typical directory structure looks something like this:

src/nvidia/:
 |- meson.build
 |- compiler
 |   |- meson.build
 |   |  ...
 |- vulkan:
 |   |- meson.build
 |   |- nvk_private.h
 |   |- nvk_device.c
 |   |  ...
 |  ...

If there is already a driver for OpenGL or OpenGL ES, it probably lives in src/gallium/drivers/<driver>/. If you want to re-use the compiler (you probably do) then it will have to be moved. You may also want to pull other common components into src/<vendor> such as an image memory layout calculation library, device info structures, or anything else which you want to share. You don't necessarily need to do all the code motion before starting on Vulkan, but you'll want to do it early in the project before it becomes a headache.

Put the following in src/<vendor>/vulkan/meson.build and adjust as needed for your driver:

nvidia_icd = custom_target(
  'nvidia_icd',
  input : [vk_icd_gen, vk_api_xml],
  output : 'nvidia_icd.@0@.json'.format(host_machine.cpu()),
  command : [
    prog_python, '@INPUT0@',
    '--api-version', '1.3', '--xml', '@INPUT1@',
    '--lib-path', join_paths(get_option('prefix'), get_option('libdir'),
                             'libvulkan_nvidia.so'),
    '--out', '@OUTPUT@',
  ],
  build_by_default : true,
  install_dir : with_vulkan_icd_dir,
  install : true,
)

nvk_files = files( )

nvk_deps = [ ]

libvulkan_nvidia = shared_library(
  'vulkan_nvidia',
  [ nvk_files ],
  include_directories : [ inc_include, inc_src, ],
  dependencies : nvk_deps,
  gnu_symbol_visibility : 'hidden',
  install : true,
)

This will build a new shared library, libvulkan_nvidia.so, as well as an ICD file named nvidia_icd.<arch>.json which points to it, when installed. There are many details in here around how Vulkan drivers get loaded on multi-arch systems, which I will ignore because they're very boring.

Dispatch

Before we can start implementing Vulkan entrypoints, we need to set up the dispatch infrastructure. Put the following (and modify as needed) into src/<vendor>/vulkan/meson.build:

nvk_entrypoints = custom_target(
  'nvk_entrypoints',
  input : [vk_entrypoints_gen, vk_api_xml],
  output : ['nvk_entrypoints.h', 'nvk_entrypoints.c'],
  command : [
    prog_python, '@INPUT0@', '--xml', '@INPUT1@', '--proto', '--weak',
    '--out-h', '@OUTPUT0@', '--out-c', '@OUTPUT1@', '--prefix', 'nvk',
  ],
  depend_files : vk_entrypoints_gen_depend_files,
)

This will generate two files: nvk_entrypoints.h and nvk_entrypoints.c. The first contains function prototypes for every Vulkan entrypoint with the vk prefix replaced with <prefix>_. For example, since we passed --prefix nvk to the generation script, vkCreateDevice() will be named nvk_CreateDevice(). The second file, nvk_entrypoints.c contains generated entrypoint tables containing your entrypoints. You don't have to do anything special to declare what entrypoints you actually define. Thanks to a bit of compiler magic, any entrypoints you don't define will show up as NULL in the table.
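To make the "compiler magic" a little less magical, here is a minimal sketch of the weak-symbol trick. It assumes GCC or Clang on an ELF platform such as Linux. The demo_ names are hypothetical stand-ins, not what the generator actually emits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entrypoint prototypes, declared weak the way a generated
 * header might declare them. A weak symbol with no definition anywhere in
 * the link resolves to address zero instead of causing a link error. */
int demo_CreateThing(int value) __attribute__((weak));
int demo_DestroyThing(int value) __attribute__((weak));

/* The "driver" implements only one of the two entrypoints. */
int demo_CreateThing(int value)
{
   return value + 1;
}

typedef int (*demo_entrypoint_fn)(int);

struct demo_entrypoint_table {
   demo_entrypoint_fn CreateThing;
   demo_entrypoint_fn DestroyThing;
};

/* The table references every entrypoint unconditionally. The one that was
 * never defined shows up as NULL. */
const struct demo_entrypoint_table demo_entrypoints = {
   .CreateThing = demo_CreateThing,
   .DestroyThing = demo_DestroyThing,
};
```

Checking demo_entrypoints.DestroyThing against NULL at runtime is then enough to tell whether the driver provides that entrypoint. This reliance on the linker is also why link order can matter.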

To ensure these get added into your driver library, you'll need to add nvk_entrypoints to the input list in your shared_library() call and idep_vulkan_util and idep_vulkan_runtime to nvk_deps in your meson.build:

nvk_files = files( )

nvk_deps = [
  idep_vulkan_runtime,
  idep_vulkan_util,
]

libvulkan_nvidia = shared_library(
  'vulkan_nvidia',
  [ nvk_entrypoints, nvk_files ],
  include_directories : [ inc_include, inc_src, ],
  dependencies : nvk_deps,
  gnu_symbol_visibility : 'hidden',
  install : true,
)

(Note: The weak function pointers used to implement entrypoint tables occasionally break in strange ways depending on link order. The solution is to ensure that anything which pulls in intermediate libraries which contain Vulkan entrypoints is linked with link_whole, unless you're using the Visual Studio compiler. See src/vulkan/runtime/meson.build for more details.)

Setting up the instance

We're about to start defining structs that are part of your new Vulkan driver so we'll need somewhere to put them. Most Vulkan drivers in Mesa today lump everything into <prefix>_private.h because we did that with ANV, and everyone copied+pasted that structure. If you want to be a bit better organized, go for it! We'll use nvk_private.h because I'm boring and don't want to strain my brain just for a blog post.

Next, we need an nvk_instance struct to hold our instance and any related data, so we'll put the following in nvk_private.h:

#include "nvk_entrypoints.h"
#include "vulkan/runtime/vk_instance.h"
#include "vulkan/runtime/vk_log.h"
#include "vulkan/util/vk_alloc.h"

struct nvk_instance {
   struct vk_instance vk;

   /* Any other stuff you want goes here */
};

VK_DEFINE_HANDLE_CASTS(nvk_instance, vk.base, VkInstance,
                       VK_OBJECT_TYPE_INSTANCE)

As you can see, the first element of our nvk_instance struct is a vk_instance called vk. This acts as the base class for all Vulkan instances in Mesa and stores a bunch of useful stuff for debug logging, dispatch, etc. If you look at the definition of vk_instance, you'll see that its first member is vk_object_base. Every Vulkan object in your driver must be derived from vk_object_base and the base struct must always be the first member. This is because there are a few things which use void pointer casts because of C's lack of support for proper subclassing. It's not quite as unsafe as it sounds, though, because we have mechanisms for attempting to verify a vk_object_base pointer at runtime.

The VK_DEFINE_HANDLE_CASTS macro defines a pair of functions: nvk_instance_to_handle() and nvk_instance_from_handle() which do about what you'd expect: convert a VkInstance to and from a struct nvk_instance *. These also enable the use of the VK_FROM_HANDLE() macro, which we'll see shortly. When converting from a VkInstance to a nvk_instance pointer, we assert at runtime that the object type is VK_OBJECT_TYPE_INSTANCE to provide a bit of added type safety because some handle types just map to uint64_t and so have no real compile-time type information.
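For the curious, here's roughly what such a macro can look like. This is a simplified sketch under my own assumptions, not the Mesa implementation: every DEMO_/demo_ name is hypothetical, and the handle is a plain uint64_t to show why the runtime type check is worth having.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the runtime's object type and base object. */
enum demo_object_type {
   DEMO_OBJECT_TYPE_INSTANCE = 1,
   DEMO_OBJECT_TYPE_BUFFER = 2,
};

struct demo_object_base {
   enum demo_object_type type;
};

/* An integer handle: the compiler cannot tell handle types apart. */
typedef uint64_t DemoBuffer;

struct demo_buffer {
   struct demo_object_base base; /* must be the first member */
   size_t size;
};

/* Generates <type>_to_handle() and <type>_from_handle(). The conversion
 * back from a handle asserts that the object really has the expected type. */
#define DEMO_DEFINE_HANDLE_CASTS(drv_type, base_member, HandleType, obj_type) \
   _Static_assert(offsetof(struct drv_type, base_member) == 0,                \
                  "the base object must be the first member");                \
   static inline HandleType drv_type##_to_handle(struct drv_type *obj)        \
   {                                                                          \
      return obj ? (HandleType)(uintptr_t)&obj->base_member : (HandleType)0;  \
   }                                                                          \
   static inline struct drv_type *drv_type##_from_handle(HandleType handle)   \
   {                                                                          \
      struct demo_object_base *base =                                         \
         (struct demo_object_base *)(uintptr_t)handle;                        \
      assert(base == NULL || base->type == (obj_type));                       \
      return (struct drv_type *)base;                                         \
   }

DEMO_DEFINE_HANDLE_CASTS(demo_buffer, base, DemoBuffer, DEMO_OBJECT_TYPE_BUFFER)

/* Declares a typed local pointer from a handle, in the spirit of
 * VK_FROM_HANDLE(). */
#define DEMO_FROM_HANDLE(drv_type, name, handle) \
   struct drv_type *name = drv_type##_from_handle(handle)
```

With this in place, DEMO_FROM_HANDLE(demo_buffer, buffer, handle) at the top of an entrypoint gives you a typed buffer pointer, and a handle of the wrong type trips the assert in debug builds.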

Now that we have the header file in shape, it's time for some code. We'll create a new file called nvk_device.c for all our instance and device-level stuff. (Again, yes, we could stand to be better organized.) We'll start with our table of supported instance extensions:

#include "nvk_private.h"

static const struct vk_instance_extension_table instance_extensions = {
   .KHR_get_physical_device_properties2   = true,
   .EXT_debug_report                      = true,
   .EXT_debug_utils                       = true,
};

You may want more than this eventually, but these three you'll want to implement right away. Both VK_EXT_debug_report and VK_EXT_debug_utils will be implemented for you if you use the right base structs. All you have to do is advertise them. KHR_get_physical_device_properties2 is one you'll have to implement, but it's basically required for Vulkan these days, so there's no sense in waiting.

Next, we implement nvk_CreateInstance():

VKAPI_ATTR VkResult VKAPI_CALL
nvk_CreateInstance(const VkInstanceCreateInfo *pCreateInfo,
                   const VkAllocationCallbacks *pAllocator,
                   VkInstance *pInstance)
{
   struct nvk_instance *instance;
   VkResult result;

   if (pAllocator == NULL)
      pAllocator = vk_default_allocator();

   instance = vk_alloc(pAllocator, sizeof(*instance), 8,
                       VK_SYSTEM_ALLOCATION_SCOPE_INSTANCE);
   if (!instance)
      return vk_error(NULL, VK_ERROR_OUT_OF_HOST_MEMORY);

   struct vk_instance_dispatch_table dispatch_table;
   vk_instance_dispatch_table_from_entrypoints(
      &dispatch_table, &nvk_instance_entrypoints, true);

   result = vk_instance_init(&instance->vk, &instance_extensions,
                             &dispatch_table, pCreateInfo, pAllocator);
   if (result != VK_SUCCESS) {
      vk_free(pAllocator, instance);
      return result;
   }

   /* Initialize driver-specific stuff */

   *pInstance = nvk_instance_to_handle(instance);

   return VK_SUCCESS;
}

Let's start with allocation. Most Vulkan entrypoints which create or destroy an object take a VkAllocationCallbacks pointer, which you're supposed to use to allocate memory for the object. Working with these manually is tedious at best, so we provide vk_alloc() and vk_free() helpers which allocate with respect to the requested allocator. The vk_alloc2() and vk_free2() versions take two allocators and implement the required fall-back. We also provide vk_default_allocator(), an allocator that maps everything to the C standard library's malloc() and free(). These and a few other nifty allocation helpers can be found in src/vulkan/util/vk_alloc.h.
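The two-allocator fall-back is easier to see in code. Below is a minimal sketch that uses a simplified, hypothetical demo_allocator struct in place of VkAllocationCallbacks. The real helpers also take an allocation scope and honor the alignment, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for VkAllocationCallbacks. */
struct demo_allocator {
   void *user_data;
   void *(*alloc_fn)(void *user_data, size_t size, size_t align);
   void (*free_fn)(void *user_data, void *mem);
};

static void *demo_default_alloc(void *user_data, size_t size, size_t align)
{
   (void)user_data;
   (void)align; /* malloc() alignment is good enough for this sketch */
   return malloc(size);
}

static void demo_default_free(void *user_data, void *mem)
{
   (void)user_data;
   free(mem);
}

/* Plays the role of a default allocator: everything maps to malloc/free. */
static const struct demo_allocator *demo_default_allocator(void)
{
   static const struct demo_allocator allocator = {
      .user_data = NULL,
      .alloc_fn = demo_default_alloc,
      .free_fn = demo_default_free,
   };
   return &allocator;
}

/* A client-style allocator which counts live allocations in user_data. */
static void *demo_counting_alloc(void *user_data, size_t size, size_t align)
{
   (void)align;
   void *mem = malloc(size);
   if (mem)
      ++*(int *)user_data;
   return mem;
}

static void demo_counting_free(void *user_data, void *mem)
{
   --*(int *)user_data;
   free(mem);
}

/* Two-allocator variant: prefer the per-call allocator if the client gave
 * us one, otherwise fall back to the parent object's allocator. */
static void *demo_alloc2(const struct demo_allocator *parent,
                         const struct demo_allocator *per_call,
                         size_t size, size_t align)
{
   assert(parent != NULL);
   const struct demo_allocator *a = per_call ? per_call : parent;
   return a->alloc_fn(a->user_data, size, align);
}

static void demo_free2(const struct demo_allocator *parent,
                       const struct demo_allocator *per_call, void *mem)
{
   assert(parent != NULL);
   const struct demo_allocator *a = per_call ? per_call : parent;
   if (mem)
      a->free_fn(a->user_data, mem);
}
```

The important property is that an object is freed through the same allocator that created it, which is why the destroy path has to apply exactly the same fall-back rule as the create path.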

Before we can actually initialize the base vk_instance, we need to convert our entrypoint table to a dispatch table. The entrypoint table generator we invoked earlier generates a vk_instance_entrypoint_table but vk_instance_init() wants a vk_instance_dispatch_table. What's the difference, and why are there two of them? That's a topic for another day. The short version is that the conversion deals with de-duplicating entrypoints from when an extension gets promoted.

Finally, we can actually initialize our vk_instance by calling vk_instance_init(). This function does a bit more than just initialize a data structure. It sets up all the logging infrastructure for instance create logging through VK_EXT_debug_utils. It also does a few Vulkan API version number checks and checks to ensure that every extension specified by VkInstanceCreateInfo::ppEnabledExtensionNames is actually supported by your implementation and returns VK_ERROR_EXTENSION_NOT_PRESENT if an unsupported extension is requested.
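The extension check is conceptually simple. Here is a rough sketch of the idea, assuming a hypothetical name-plus-flag table rather than the generated bitfield structs the runtime really uses. The demo_ names are mine and the error value merely mirrors VK_ERROR_EXTENSION_NOT_PRESENT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for the sketch. */
enum demo_result {
   DEMO_SUCCESS = 0,
   DEMO_ERROR_EXTENSION_NOT_PRESENT = -7,
};

/* Hypothetical table entry: an extension name and whether we support it. */
struct demo_extension {
   const char *name;
   bool supported;
};

/* Every requested name must be known and marked as supported. */
static enum demo_result
demo_check_extensions(const struct demo_extension *table, size_t table_len,
                      const char *const *enabled_names, size_t enabled_count)
{
   assert(enabled_names != NULL || enabled_count == 0);

   for (size_t i = 0; i < enabled_count; i++) {
      bool found = false;
      for (size_t j = 0; j < table_len; j++) {
         if (strcmp(enabled_names[i], table[j].name) == 0) {
            found = table[j].supported;
            break;
         }
      }
      if (!found)
         return DEMO_ERROR_EXTENSION_NOT_PRESENT;
   }

   return DEMO_SUCCESS;
}
```

Note that an unknown name and a known-but-unsupported name are treated the same way: the client asked for something we can't provide, so creation fails.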

And there you go! You've created your first VkInstance object. For completeness, we should also implement vkDestroyInstance():

VKAPI_ATTR void VKAPI_CALL
nvk_DestroyInstance(VkInstance _instance,
                    const VkAllocationCallbacks *pAllocator)
{
   VK_FROM_HANDLE(nvk_instance, instance, _instance);

   if (!instance)
      return;

   vk_instance_finish(&instance->vk);
   vk_free(&instance->vk.alloc, instance);
}

Before we brush past it, there is one interesting thing here: the call to VK_FROM_HANDLE(). This macro declares a pointer to a struct nvk_instance and initializes it with nvk_instance_from_handle(_instance). Because these sorts of casts are so prevalent in a Vulkan driver, especially at the tops of entrypoints, having a macro for it helps the ergonomics a good bit.

Now that we have an instance, we can implement vkGetInstanceProcAddr() trivially as it's just a wrapper around a helper provided in vk_instance.h:

VKAPI_ATTR PFN_vkVoidFunction VKAPI_CALL
nvk_GetInstanceProcAddr(VkInstance _instance,
                        const char *pName)
{
   VK_FROM_HANDLE(nvk_instance, instance, _instance);
   return vk_instance_get_proc_addr(&instance->vk,
                                    &nvk_instance_entrypoints,
                                    pName);
}

PUBLIC VKAPI_ATTR PFN_vkVoidFunction VKAPI_CALL
vk_icdGetInstanceProcAddr(VkInstance instance,
                          const char *pName);


PUBLIC VKAPI_ATTR PFN_vkVoidFunction VKAPI_CALL
vk_icdGetInstanceProcAddr(VkInstance instance,
                          const char *pName)
{
   return nvk_GetInstanceProcAddr(instance, pName);
}

The last bit adds a wrapper, so we also expose the more loader-friendly vk_icdGetInstanceProcAddr(). For more details about the loader interface, see the loader driver interface doc.

Logging

Before we get into creating other objects, we should explain that vk_error() call in nvk_CreateInstance(). This is part of the broader common logging framework found in src/vulkan/runtime/vk_log.h. Any message logged through this framework automatically gets sent to stderr in debug builds, to the Android logging framework if you've built for Android, and to the client via VK_EXT_debug_report and VK_EXT_debug_utils.

For most log messages, use the vk_log* family of macros. These are printf-like macros and support anything that your C standard library's printf() call does. For instance, to log a debug-level message on a device, do:

vk_logd(VK_LOG_OBJS(device), "vkDeviceWaitIdle() took %u us", wait_time);

The VK_LOG_OBJS() macro can take up to 8 objects which will be passed along with the message to any VkDebugUtilsMessengerEXTs. If you want to log something on the instance only, with no objects, you can use VK_LOG_NO_OBJS(instance) instead. If VK_LOG_NO_OBJS(NULL) is used, then the message will only go to stderr and the Android logging framework because we can't get the list of messengers.

For errors, there is a special vk_error() macro which takes a VkResult and generates a log message containing the error. If you want to provide additional information in the log message, the vk_errorf() macro is a printf-like macro which generates the same message as vk_error() but with your log message appended. The typical pattern is to wrap each error in a vk_error() or vk_errorf() wherever the error was originally generated. If you're propagating errors from some other function, there's no need to wrap because it's already been logged. We already saw one example when creating our instance above:

instance = vk_alloc(pAllocator, sizeof(*instance), 8,
                    VK_SYSTEM_ALLOCATION_SCOPE_INSTANCE);
if (!instance)
   return vk_error(NULL, VK_ERROR_OUT_OF_HOST_MEMORY);

Here's another example from ANV's vkMapMemory() implementation:

if (mem->map != NULL) {
   return vk_errorf(device, VK_ERROR_MEMORY_MAP_FAILED,
                    "Memory object already mapped.");
}

Both macros take an object as their first parameter, the object generating the error. For object creation errors (like out of memory), this should be the parent object, typically a device or instance. If you don't have a parent object (such as in vkCreateInstance()), you can pass in NULL. As with other logging, this means that it won't show up in any client call-backs.
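If you're wondering how a macro can both log and return the error, here is a minimal sketch of the pattern. It is not the vk_log.h implementation: the demo_ names, the error counter, and the message format are all made up for illustration, and the error values merely mirror the Vulkan ones.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical result codes for the sketch. */
enum demo_result {
   DEMO_SUCCESS = 0,
   DEMO_ERROR_OUT_OF_HOST_MEMORY = -1,
   DEMO_ERROR_MEMORY_MAP_FAILED = -5,
};

/* Only here so the sketch can be checked; a real logger has no such thing. */
static int demo_error_count;

/* Logs where the error was generated, then hands the result straight back
 * so the caller can write "return demo_error(...)". */
static enum demo_result
demo_log_error(const void *obj, enum demo_result result,
               const char *file, int line, const char *fmt, ...)
{
   assert(result != DEMO_SUCCESS);

   fprintf(stderr, "%s:%d: error %d (%s)", file, line, (int)result,
           obj ? "with object" : "no object");
   if (fmt) {
      va_list ap;
      va_start(ap, fmt);
      fputs(": ", stderr);
      vfprintf(stderr, fmt, ap);
      va_end(ap);
   }
   fputc('\n', stderr);

   demo_error_count++;
   return result;
}

#define demo_error(obj, result) \
   demo_log_error((obj), (result), __FILE__, __LINE__, NULL)

#define demo_errorf(obj, result, ...) \
   demo_log_error((obj), (result), __FILE__, __LINE__, __VA_ARGS__)

/* Typical use: wrap the error where it originates and return it. */
static enum demo_result demo_map(const void *device, int already_mapped)
{
   if (already_mapped) {
      return demo_errorf(device, DEMO_ERROR_MEMORY_MAP_FAILED,
                         "Memory object already mapped.");
   }
   return DEMO_SUCCESS;
}
```

Because the macro captures __FILE__ and __LINE__ at the call site, the log points at the place the error originated. That only works if callers further up the stack propagate the result without wrapping it again.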

Physical devices

Physical devices look much the same as instances:

struct nvk_physical_device {
   struct vk_physical_device vk;

   /* Driver-specific stuff */
};

VK_DEFINE_HANDLE_CASTS(nvk_physical_device, vk.base, VkPhysicalDevice,
                       VK_OBJECT_TYPE_PHYSICAL_DEVICE)

We won't spend too much time on the struct, initialization, etc. If you've written much C code at all, you know the pattern. One difference between physical devices and instances is when they are created. The instance has an explicit vkCreateInstance() entrypoint whereas physical devices get created implicitly at some unknown time between vkCreateInstance() and the first call to vkEnumeratePhysicalDevices(). Most Mesa Vulkan drivers do the actual walking of /dev/dri and creation of corresponding physical devices as part of the first vkEnumeratePhysicalDevices() to make instance creation faster. I don't know that this has any tangible benefits but it's the common pattern in Mesa today.

One other difference from instance initialization which may be useful is that, while vk_physical_device_init() takes a vk_device_extension_table struct of supported device extensions, you can also pass NULL and it will simply memset() the table to all false so you can fill it out yourself later. This is because determining feature support often requires that a bunch of the physical device initialization work has already been done. Allowing drivers to re-order initialization such that determining feature support happens late in the process makes things a bit easier. I don't personally like this and, one day, I'd like to have a better-defined point at which the vk_physical_device is fully initialized, but it's convenient for now.

If a client is going to use your physical device, they're going to need to know about supported features, so the next step is to implement vkGetPhysicalDeviceFeatures2() and vkGetPhysicalDeviceProperties2(). Note the 2. Don't bother implementing the original Vulkan 1.0 calls unless you really want to. If it's missing from your dispatch table, we'll implement it for you in terms of vkGetPhysicalDeviceFeatures2() or vkGetPhysicalDeviceProperties2(). Here's what a vkGetPhysicalDeviceFeatures2() implementation might look like:

VKAPI_ATTR void VKAPI_CALL
nvk_GetPhysicalDeviceFeatures2(VkPhysicalDevice physicalDevice,
                               VkPhysicalDeviceFeatures2 *pFeatures)
{
   VK_FROM_HANDLE(nvk_physical_device, pdevice, physicalDevice);

   pFeatures->features = (VkPhysicalDeviceFeatures) {
      .robustBufferAccess = true,
      /* More features */
   };

   VkPhysicalDeviceVulkan11Features core_1_1 = {
      .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_1_FEATURES,
      /* Vulkan 1.1 features */
   };

   VkPhysicalDeviceVulkan12Features core_1_2 = {
      .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_2_FEATURES,
      /* Vulkan 1.2 features */
   };

   VkPhysicalDeviceVulkan13Features core_1_3 = {
      .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_3_FEATURES,
      /* Vulkan 1.3 features */
   };

   vk_foreach_struct(ext, pFeatures->pNext) {
      if (vk_get_physical_device_core_1_1_feature_ext(ext, &core_1_1))
         continue;
      if (vk_get_physical_device_core_1_2_feature_ext(ext, &core_1_2))
         continue;
      if (vk_get_physical_device_core_1_3_feature_ext(ext, &core_1_3))
         continue;

      switch (ext->sType) {
      case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_4444_FORMATS_FEATURES_EXT: {
         VkPhysicalDevice4444FormatsFeaturesEXT *features = (void *)ext;
         features->formatA4R4G4B4 = true;
         features->formatA4B4G4R4 = true;
         break;
      }
      /* More feature structs */
      default:
         break;
      }
   }
}

Most of this is pretty self-explanatory, but there's some bits around core features and properties that we should discuss. The vk_get_physical_device_core_1_1_feature_ext() call at the top of our loop checks to see if the provided extension struct is VkPhysicalDeviceVulkan11Features or is from an extension promoted to core in Vulkan 1.1 and, if it is, fills it out from the VkPhysicalDeviceVulkan11Features struct we passed in, and returns true to let us know it's been handled. There are helpers like this for every major Vulkan version. This allows us to avoid any possible mismatches between features advertised through extensions and those advertised through the core structs.
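To show what such a helper does under the hood, here is a small sketch with hypothetical demo_ structs standing in for the real feature structs. Real Vulkan structs repeat sType and pNext inline, whereas the sketch embeds a shared header struct to keep the casts simple.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_stype {
   DEMO_STYPE_CORE_1_1_FEATURES,
   DEMO_STYPE_MULTIVIEW_FEATURES, /* extension promoted into "core 1.1" */
   DEMO_STYPE_4444_FORMATS_FEATURES,
};

/* Shared header used to walk the chain. */
struct demo_base_out {
   enum demo_stype sType;
   struct demo_base_out *pNext;
};

struct demo_core_1_1_features {
   struct demo_base_out base;
   bool multiview;
   bool samplerYcbcrConversion;
};

struct demo_multiview_features {
   struct demo_base_out base;
   bool multiview;
};

struct demo_4444_formats_features {
   struct demo_base_out base;
   bool formatA4R4G4B4;
};

/* Fills either the core struct or any struct promoted into it from the
 * single source of truth in "core". Returns true if ext was handled. */
static bool
demo_get_core_1_1_feature_ext(struct demo_base_out *ext,
                              const struct demo_core_1_1_features *core)
{
   assert(ext != NULL && core != NULL);

   switch (ext->sType) {
   case DEMO_STYPE_CORE_1_1_FEATURES: {
      struct demo_core_1_1_features *out = (struct demo_core_1_1_features *)ext;
      out->multiview = core->multiview;
      out->samplerYcbcrConversion = core->samplerYcbcrConversion;
      return true;
   }
   case DEMO_STYPE_MULTIVIEW_FEATURES: {
      struct demo_multiview_features *out = (struct demo_multiview_features *)ext;
      out->multiview = core->multiview;
      return true;
   }
   default:
      return false;
   }
}

/* The driver side: declare core features once, then walk the chain. */
static void demo_get_features2(struct demo_base_out *chain)
{
   const struct demo_core_1_1_features core_1_1 = {
      .base = { .sType = DEMO_STYPE_CORE_1_1_FEATURES },
      .multiview = true,
      .samplerYcbcrConversion = false,
   };

   for (struct demo_base_out *ext = chain; ext != NULL; ext = ext->pNext) {
      if (demo_get_core_1_1_feature_ext(ext, &core_1_1))
         continue;

      switch (ext->sType) {
      case DEMO_STYPE_4444_FORMATS_FEATURES:
         ((struct demo_4444_formats_features *)ext)->formatA4R4G4B4 = true;
         break;
      default:
         break;
      }
   }
}
```

Since both the core struct and the promoted extension struct are filled from the same core_1_1 variable, they cannot disagree about multiview no matter which one the client queries.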

One thing you don't need to worry about implementing is vkEnumerateDeviceExtensionProperties(). Because the table of supported extensions lives in the vk_physical_device, we can implement that one for you in common code. This saves you the headache of dealing with extension name strings and lets us handle certain annoying Android corner cases for you.

To satisfy the loader driver interface requirements, you'll also need to implement vk_icdGetPhysicalDeviceProcAddr():

PUBLIC VKAPI_ATTR PFN_vkVoidFunction VKAPI_CALL
vk_icdGetPhysicalDeviceProcAddr(VkInstance  _instance,
                                const char* pName);

PUBLIC VKAPI_ATTR PFN_vkVoidFunction VKAPI_CALL
vk_icdGetPhysicalDeviceProcAddr(VkInstance  _instance,
                                const char* pName)
{
   VK_FROM_HANDLE(nvk_instance, instance, _instance);
   return vk_instance_get_physical_device_proc_addr(&instance->vk, pName);
}

Common implementation of entrypoints

I mentioned above, without much explanation, that if you don't implement vkGetPhysicalDeviceFeatures(), it will get implemented in terms of vkGetPhysicalDeviceFeatures2(). How does this work? I'm so very glad you asked! Let's take a moment to properly explore this.

Inside the Vulkan runtime code in Mesa located in src/vulkan/runtime, we have a set of entrypoint tables using the vk_common_ prefix. These are used to implement a variety of functionality ranging from trivial wrappers to full complex implementations of whole object types. As part of vk_instance_init() or vk_device_init(), we look for NULL function pointers in your dispatch table and fill them with common implementations if a common implementation exists. We do this regardless of whether or not you actually support the feature and trust the vkGet*ProcAddr() implementation to return NULL for unsupported entrypoints. If you're looking to see if the runtime code has an implementation of vkFoo(), search for vk_common_Foo() and that should find it.

One of the most common uses of this is to implement vkFoo() in terms of vkFoo2(). If you're implementing something where the latest Vulkan spec has both a vkFoo() and a vkFoo2(), you should always skip vkFoo() and go straight to vkFoo2() unless you really want to implement both. This will work even if you don't expose the Vulkan core version or extension that provides vkFoo2(). If a wrapper does not yet exist for the entrypoint you're implementing, make a merge request to add one.
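Here is a toy version of that mechanism: a common wrapper which implements the legacy call in terms of the "2" variant, plus the step which fills in whatever the driver left NULL. Everything named demo_ is hypothetical and the tables are reduced to two slots. The real dispatch tables are generated and much larger.

```c
#include <assert.h>
#include <stddef.h>

struct demo_features { int robustBufferAccess; };
struct demo_features2 { struct demo_features features; };

struct demo_device;

struct demo_dispatch_table {
   void (*GetFeatures)(struct demo_device *dev, struct demo_features *out);
   void (*GetFeatures2)(struct demo_device *dev, struct demo_features2 *out);
};

struct demo_device {
   struct demo_dispatch_table dispatch;
};

/* Common code: the legacy call is a thin wrapper which goes back through
 * the dispatch table to reach the driver's "2" implementation. */
static void demo_common_GetFeatures(struct demo_device *dev,
                                    struct demo_features *out)
{
   assert(dev->dispatch.GetFeatures2 != NULL);
   struct demo_features2 features2 = { { 0 } };
   dev->dispatch.GetFeatures2(dev, &features2);
   *out = features2.features;
}

static const struct demo_dispatch_table demo_common_entrypoints = {
   .GetFeatures = demo_common_GetFeatures,
   .GetFeatures2 = NULL, /* no common implementation for this one */
};

/* Fill only the slots the driver left NULL; never override the driver. */
static void demo_dispatch_fill_missing(struct demo_dispatch_table *table,
                                       const struct demo_dispatch_table *common)
{
   if (table->GetFeatures == NULL)
      table->GetFeatures = common->GetFeatures;
   if (table->GetFeatures2 == NULL)
      table->GetFeatures2 = common->GetFeatures2;
}

/* Driver code: implements only the "2" variant. */
static void demo_drv_GetFeatures2(struct demo_device *dev,
                                  struct demo_features2 *out)
{
   (void)dev;
   out->features.robustBufferAccess = 1;
}

static void demo_device_init(struct demo_device *dev)
{
   dev->dispatch = (struct demo_dispatch_table) {
      .GetFeatures2 = demo_drv_GetFeatures2,
   };
   demo_dispatch_fill_missing(&dev->dispatch, &demo_common_entrypoints);
}
```

After demo_device_init(), a client calling the legacy GetFeatures slot lands in the common wrapper, which forwards to the driver's GetFeatures2 and copies the result out.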

Devices and Queues

Device creation looks much the same as instance creation as far as the common Vulkan runtime code goes. There is an explicit vkCreateDevice() entrypoint in which you need to create your device, and vk_device_init() takes a dispatch table and pCreateInfo just like vk_instance_init(). As with instances, vk_device_init() checks to ensure that every extension specified by VkDeviceCreateInfo::ppEnabledExtensionNames is actually supported according to the table of supported extensions in the vk_physical_device and returns VK_ERROR_EXTENSION_NOT_PRESENT if an unsupported extension is requested.

With your device, you'll also need to create queues. These also have a base struct called vk_queue. As with physical devices, queues are weird in that the spec doesn't say when they get created. It must be between vkCreateDevice() and the first call to vkGetDeviceQueue() which requests it. All Mesa drivers create them as part of vkCreateDevice(). This allows us to maintain a list of queues inside the vk_device and implement vkGetDeviceQueue() and vkGetDeviceQueue2() for you.
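The reason creating queues up front helps is that the device then owns a complete list which common code can search. A minimal sketch of that idea follows, using a fixed-size array and hypothetical demo_ names rather than the runtime's actual data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_QUEUES 8

struct demo_queue {
   uint32_t queue_family_index;
   uint32_t index_in_family;
   /* Driver-specific stuff would go here */
};

struct demo_device {
   struct demo_queue queues[DEMO_MAX_QUEUES];
   uint32_t queue_count;
};

/* Called by the driver for each requested queue during device creation.
 * Registering the queue with the device is what makes the common lookup
 * below possible. */
static struct demo_queue *
demo_queue_init(struct demo_device *dev, uint32_t family, uint32_t index)
{
   assert(dev->queue_count < DEMO_MAX_QUEUES);
   struct demo_queue *queue = &dev->queues[dev->queue_count++];
   queue->queue_family_index = family;
   queue->index_in_family = index;
   return queue;
}

/* Common code: find the queue with the given family and index, or NULL. */
static struct demo_queue *
demo_get_device_queue(struct demo_device *dev, uint32_t family, uint32_t index)
{
   for (uint32_t i = 0; i < dev->queue_count; i++) {
      struct demo_queue *queue = &dev->queues[i];
      if (queue->queue_family_index == family &&
          queue->index_in_family == index)
         return queue;
   }
   return NULL;
}
```

The lookup never needs to know anything driver-specific about a queue, which is what lets it live in common code.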

Of course, your devices and queues will need to be more than a bare vk_device or vk_queue. You'll likely also need data structures for memory allocation, handles to various kernel driver resources, etc. Then, you'll need to implement the various APIs around allocating and binding memory, creating images, buffers, and other Vulkan objects, etc. We won't cover any of that in detail here because there's not much we can do to help in common code just yet; it's mostly vendor-specific.

Synchronization

Before you get too excited and start to implement VkFence, VkSemaphore, and vkQueueSubmit(), stop. Unless you are a very special driver such as venus, which is implementing Vulkan pass-through for VMs, you should not be implementing synchronization yourself. You will get it wrong.

Instead, we have a common synchronization framework built around vk_sync objects. In vk_physical_device, there is a supported_sync_types array which describes each vk_sync_type supported by your implementation. If your driver uses DRM sync objects (it should!), a vk_sync implementation is provided for you in vk_drm_syncobj.h. For any other synchronization type such as BO-based synchronization for talking with X11 or amdgpu's internal sync handles, you can implement your own vk_sync_type, which describes the capabilities of the synchronization primitive and provides hooks for various operations such as waiting on it or signaling it from the CPU. Once you've provided some vk_sync_types, VkFence and VkSemaphore are taken care of for you by common code.
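To give a feel for the shape of this, here is a heavily simplified sketch of a sync type: a set of capability bits plus a couple of hooks, and a helper which picks the first supported type with the required capabilities. All demo_ names, the feature bits, and the hook signatures are assumptions for illustration and do not match the real vk_sync interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits a sync type can advertise. */
enum demo_sync_features {
   DEMO_SYNC_FEATURE_BINARY = 1 << 0,
   DEMO_SYNC_FEATURE_TIMELINE = 1 << 1,
   DEMO_SYNC_FEATURE_CPU_WAIT = 1 << 2,
   DEMO_SYNC_FEATURE_CPU_SIGNAL = 1 << 3,
};

struct demo_sync;

/* A sync type describes what a primitive can do and how to operate it. */
struct demo_sync_type {
   uint32_t features;
   void (*signal)(struct demo_sync *sync, uint64_t value);
   bool (*is_signaled)(const struct demo_sync *sync, uint64_t value);
};

struct demo_sync {
   const struct demo_sync_type *type;
   uint64_t value;
};

/* A trivial CPU-only implementation of the hooks. */
static void demo_cpu_signal(struct demo_sync *sync, uint64_t value)
{
   sync->value = value;
}

static bool demo_cpu_is_signaled(const struct demo_sync *sync, uint64_t value)
{
   return sync->value >= value;
}

static const struct demo_sync_type demo_binary_type = {
   .features = DEMO_SYNC_FEATURE_BINARY | DEMO_SYNC_FEATURE_CPU_SIGNAL,
   .signal = demo_cpu_signal,
   .is_signaled = demo_cpu_is_signaled,
};

static const struct demo_sync_type demo_timeline_type = {
   .features = DEMO_SYNC_FEATURE_TIMELINE | DEMO_SYNC_FEATURE_CPU_WAIT |
               DEMO_SYNC_FEATURE_CPU_SIGNAL,
   .signal = demo_cpu_signal,
   .is_signaled = demo_cpu_is_signaled,
};

/* Walks a NULL-terminated list, in order of preference, and returns the
 * first type which has every required feature. */
static const struct demo_sync_type *
demo_pick_sync_type(const struct demo_sync_type *const *supported,
                    uint32_t required)
{
   assert(supported != NULL);
   for (; *supported != NULL; supported++) {
      if (((*supported)->features & required) == required)
         return *supported;
   }
   return NULL;
}
```

Common code built this way only ever asks "which type can do X?", so a driver adds a new primitive by describing it and never has to touch the fence or semaphore logic.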

Your driver will also need to fill out vk_queue::driver_submit with a function that handles submission to the kernel driver. We provide implementations of vkQueueSubmit() and vkQueueBindSparse() in terms of this hook. This allows us to implement both userspace emulated and kernel assisted timeline semaphores for you. The cross-process negotiation that makes timeline semaphores on Linux possible is extremely tricky and subtle; you will get it wrong if you try to do it yourself.

If you want timeline semaphores, there are currently two options on Linux. The first and better option is to implement DRM sync object timeline support in your kernel driver. If you do, vk_drm_syncobj_get_type() will return a type with syncobj support, VkSemaphore will advertise cross-process sharing support for timelines, and we'll take care of the negotiation stuff for you. If your kernel doesn't support timeline DRM sync objects, you can use vk_sync_timeline, which emulates timelines using another binary vk_sync_type. The former is preferred because it supports sharing, but the emulation may be necessary for supporting Vulkan 1.2 on older kernel drivers.

Everything I've outlined so far is about Linux, but it should work fine on Windows too. Things are actually quite a bit simpler there because the built-in synchronization primitive already does timelines. When working on the synchronization framework, I typed up support for WDDM2 fences to prove that it works. It's sitting in a WIP state and isn't exactly well-tested because we don't have any Windows drivers to use it yet, but it should work.

Command buffers and pools

We provide base vk_command_pool and vk_command_buffer structs which you should use. These are required for our generic implementation of VK_EXT_debug_utils and its command buffer tagging. You also get a few little things for free such as vkResetCommandPool().

The other big thing that using vk_command_buffer will get you soon is a common capture/replay solution for secondary command buffers. Boris Brezillon at Collabora has a merge request posted, which implements this and enables it for panvk. Implementing secondary command buffers directly is recommended since it's likely better for performance on most hardware. However, some hardware really can't do better than capture/replay, in which case this will let us handle all those capture/replay details in common code.

Images and views

There are also vk_image and vk_image_view base structs which you can optionally use in your driver. Unlike most of the base objects we've discussed so far, these don't really do anything. They just hold copies of all the image and view creation parameters for you. However, given the number of places in the API where a size, format, or number of layers is pulled implicitly from the image view, you'll need at least some of this information on hand. We also deal with the image usage vs. view usage distinction added in VK_KHR_maintenance1 for you in case that matters for your driver.

Even though these structs are currently optional, and the immediate value isn't huge, I expect we'll be building more common code which uses them in the future. You'll probably want to get onboard eventually.

Render passes

Whether or not you need real render passes is a decision only you can make. Some hardware gets real benefit from re-ordering and combining subpasses within a render pass. However, for most desktop hardware, there's no real point and we just want to blast commands into a buffer. If you don't care about subpass re-ordering or combining, then you don't need to bother implementing render passes at all.

Landed just this week is a merge request which implements all of render passes in terms of VK_KHR_dynamic_rendering. All you have to do is implement vkCmdBeginRendering() and vkCmdEndRendering(). For the couple of places where render passes are passed into the API, there are helpers which return you a VK_KHR_dynamic_rendering version of the relevant data. For drivers which don't care about subpass combining, this is a fantastic simplification. The ANV (Intel) patch is +926/-2633 lines of code and the new code is not only shorter but way easier to read and understand.

If you choose to use the common render pass implementation, your driver will need to use vk_image and vk_image_view as we need to be able to introspect images and views.

Final comments

Before we wrap up, there are a few odds and ends that should be addressed but don't really deserve their own section:

We haven't discussed compilers at all. There's a huge common infrastructure for that in Mesa called NIR which you'll be using. You can read about it in my recent blog post about NIR. Before anyone asks, no, you cannot bring your own SPIR-V parser and just use LLVM.
We implement VK_EXT_private_data for you. This is a big part of why we made vk_object_base in the first place. Assuming you used base objects everywhere, everything needed for the extension is already in place. Just turn it on.
I'm working on a common VkPipelineCache implementation which has been making progress but isn't quite ready yet.
There are common vk_shader_module and vk_framebuffer objects that simply capture the input parameters. These are optional and there for you to use if you want. Not much relies on these yet. There is a SPIR-V to NIR helper which uses vk_shader_module if you choose to use it, but there is also a version that takes the SPIR-V directly so using vk_shader_module isn't strictly required.
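On the private data point above, the following toy sketch shows why a shared base object makes that kind of extension cheap to support: if every driver object starts with the same base struct, common code can attach per-object data without knowing the concrete type. The toy_ names, the fixed slot count, and the storage layout are all assumptions made for illustration. This is not how vk_object_base is actually laid out.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: names and layout are hypothetical, not Mesa's. */
#define TOY_MAX_PRIVATE_DATA_SLOTS 4

/* Shared base object carrying one 64-bit value per private data slot. */
struct toy_object_base {
   uint64_t private_data[TOY_MAX_PRIVATE_DATA_SLOTS];
};

/* A driver object embeds the base as its FIRST member, so a pointer to
 * the object is also a valid pointer to its base. */
struct toy_driver_image {
   struct toy_object_base base;
   uint32_t width;
   uint32_t height;
};

/* Common code: works on any object type that starts with the base. */
static void
toy_set_private_data(struct toy_object_base *obj, uint32_t slot,
                     uint64_t data)
{
   assert(slot < TOY_MAX_PRIVATE_DATA_SLOTS);
   obj->private_data[slot] = data;
}

/* Reads back the value for a slot. A zero-initialized object reports 0
 * for slots that were never set. */
static uint64_t
toy_get_private_data(const struct toy_object_base *obj, uint32_t slot)
{
   assert(slot < TOY_MAX_PRIVATE_DATA_SLOTS);
   return obj->private_data[slot];
}
```

The common helpers never mention toy_driver_image. They only see the base, which is the property that lets one implementation serve every object type in every driver.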

And that's about it for now. Mesa will continue to evolve and the core Vulkan runtime code and best practices will evolve with it. New stuff is getting added constantly. By 2024 or so, we may have enough fresh material to justify another post like this. Until then, if you're developing a Vulkan driver in Mesa, best keep your eyes on merge requests tagged "vulkan" to make sure you don't miss any cool new updates that get added. Happy hacking!

Venus on QEMU: Enabling the new virtual Vulkan driver

Bridging the OpenGL and Vulkan divide

PanVk: An Open Source Vulkan driver for Arm Mali Midgard and Bifrost GPUs

Comments (3)

Quinten Kock:
Apr 07, 2022 at 05:46 AM

Very nice article, thank you very much! I've been trying to get my hands dirty with writing a Vulkan driver.

I have noticed that there is a small mistake in nvk_CreateInstance, namely that pInstance is not used. Looking at Lavapipe told me that I needed to set it using "*pInstance = nvk_instance_to_handle(instance)", otherwise the system Vulkan loader (or vk_instance_get_proc_addr?) refuses to pick up on the instance.

Mauro Rossi:
Jul 29, 2024 at 08:10 AM

Hi Faith, I'm working on Android vulkan HAL for nvk (vulkan.nouveau.so module)

I am going to send a MR to mesa soon

VK_ANDROID_native_buffer (ANB) for NVK is detected by VulkanCapsViewer
https://vulkan.gpuinfo.org/displayreport.php?id=32174

and passing Android CTS dEQP-VK with these results

Test Results:
Module Passed Failed Total
dEQP-VK 568783 116 568899

The problem is that except VulkanCapsViewer (which is not rendering to screen by means of libvulkan.so loader and vulkan.nouveau.so) all the Vulkan apps I have tested present a black screen and give these errors

07-17 00:28:36.802 2137 2137 E BufferQueueConsumer: [tech.incr.vulkanandroid/tech.incr.vulkanandroid.MinimalNativeActivity#0](id:85900000010,api:1,p:5060,c:2137) acquireBuffer: max acquired buffer count reached: 2 (max 1)
07-17 00:28:36.802 2137 2137 E BufferLayerConsumer: [tech.incr.vulkanandroid/tech.incr.vulkanandroid.MinimalNativeActivity#0] updateTexImage: acquire failed: Function not implemented (-38)

while some others vkcube and gearsvk generate these errors

07-28 16:15:05.285 1023 1023 E RenderEngine: failed to wait on fence fd
...
07-28 16:15:05.468 1023 1023 E Layer : [Surface(name=Task=136)/@0x832783c - animation-leash#0] No local sync point found
07-28 16:15:05.468 1023 1023 E Layer : [Surface(name=Task=1)/@0xd82067 - animation-leash#0] No local sync point found

Could you please provide some additional information about the Android Vulkan HAL? In particular, should the current src/vulkan/runtime common code and src/vulkan/runtime/vk_physical_device_properties_gen.py not require any custom nvk_[Function], and is vk_drm_syncobj.h already implemented in the current src/nouveau/vulkan code?

1. Faith Ekstrand:
  Jul 29, 2024 at 01:33 PM
  
  First off, I think it'll work better to have this discussion if you file a bug on https://gitlab.freedesktop.org/mesa/mesa/-/issues and we can chat there. The short version is that NVK shouldn't be missing anything fundamental. However, you will need to add the Android pieces. You can see examples of this in other drivers like anv_android.c, tu_android.c and radv_android.c.
  
