Helen Koike
June 13, 2017
When I last wrote about NVMe, the feature to improve NVMe performance in emulated environments was still under discussion, with only a work-in-progress patch available. It has now been officially released in the NVMe Specification Revision 1.3 under the name "Doorbell Buffer Config command", along with an implementation that is already in the mainline Linux kernel! \o/
You can already feel the difference in performance if you compile kernel 4.12-rc1 (or later) and run it in a virtual machine hosted on Google Compute Engine. Google actually updated their hypervisor as soon as the feature was ratified by the NVMe working group, even before it was publicly released.
There were very few changes from the original proposal, i.e. the opcodes, the return values and, now, fancier names: the buffers (as described in my last post) are now called the Shadow Doorbell and EventIdx buffers.
In short, the first one mimics the Doorbell registers in memory, allowing the emulated controller to fetch the Doorbell value when convenient instead of waiting for the Doorbell register to be written. For its part, the EventIdx provides a hint given by the emulated controller to tell the host if the Doorbell register needs to be updated (in case the emulated controller is not fetching the Doorbell value from the Shadow Doorbell buffer). You can check section 7.13 of the specification for an example of usage.
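To make the interplay between the two buffers more concrete, here is a minimal Python sketch of the host-side decision. It is only a toy model: the names `need_event` and `ShadowDoorbellQueue` are hypothetical, and the 16-bit wrap-around comparison is the virtio-style event index check, which I am assuming here as a reasonable way to implement the hint. It ignores memory barriers and the real buffer layout.

```python
MASK = 0xFFFF  # doorbell values are treated as 16-bit counters in this sketch


def need_event(event_idx: int, new_idx: int, old_idx: int) -> bool:
    """Return True if event_idx lies in [old_idx, new_idx), modulo 2**16.

    In that case the controller asked to be notified, so the host has to
    write the real Doorbell register in addition to the shadow copy.
    """
    return ((new_idx - event_idx - 1) & MASK) < ((new_idx - old_idx) & MASK)


class ShadowDoorbellQueue:
    """Toy model of one queue's Shadow Doorbell and EventIdx entries."""

    def __init__(self) -> None:
        self.shadow = 0       # Shadow Doorbell entry, written by the host
        self.event_idx = 0    # EventIdx entry, written by the controller
        self.mmio_writes = 0  # how often the real register was written

    def ring(self, new_tail: int) -> None:
        old = self.shadow
        # Always update the in-memory copy first (cheap, no VM exit).
        self.shadow = new_tail & MASK
        # Only touch the (expensive, trapped) register when hinted to.
        if need_event(self.event_idx, self.shadow, old):
            self.mmio_writes += 1
```

With `event_idx` left at 0 (the controller is busy polling the shadow buffer), calling `ring(1)`, `ring(2)` and `ring(3)` results in a single register write; once the controller sets `event_idx` to 3 before going idle, the next `ring(4)` triggers a register write again.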
The following test results were obtained on a machine of type n1-standard-4 (4 vCPUs, 15 GB memory) on Google Compute Engine with kernel 4.12.0-rc5, using the following command:
$ sudo fio --time_based --name=benchmark --runtime=30 \
    --filename=/dev/nvme0n1 --nrfiles=1 --ioengine=libaio --iodepth=32 \
    --direct=1 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=1 \
    --rw=randread --blocksize=4k --randrepeat=0
Results (in Input/Output Operations per Second):
Without Shadow Doorbell and EventIdx buffers: 43.9K IOPS
With Shadow Doorbell and EventIdx buffers: 184K IOPS
Gain: roughly 4.2 times (184K / 43.9K)
Screenshot - Without Shadow Doorbell and EventIdx buffers

Screenshot - With Shadow Doorbell and EventIdx buffers

Enjoy your enhanced numbers of IOPS! :D