*

NVMe: Officially faster for emulated controllers!

Posted on 13/06/2017 by Helen Koike

The Doorbell Buffer Config command

When I last wrote about NVMe, the feature to improve NVMe performance over emulated environments was just a living discussion and a work in progress patch. However, it has now been officially released in the NVMe Specification Revision 1.3 under the name "Doorbell Buffer Config command", along with an implementation that is already in the mainline Linux kernel! \o/

You can already feel the difference in performance if you compile Kernel 4.12-rc1 (or later) and run it over a virtual machine hosted on Google Compute Engine. Google actually updated their hypervisor as soon as the feature was ratified by the NVMe working group, even before it was publicly released.

There were very few changes from the original proposal, I.e. opcodes, return values and now fancy names; the buffers (as described in my last post) are now called Shadow Doorbell and EventIdx buffers.

In short, the first one mimics the Doorbell registers in memory, allowing the emulated controller to fetch the Doorbell value when convenient instead of waiting for the Doorbell register to be written. For its part, the EventIdx provides a hint given by the emulated controller to tell the host if the Doorbell register needs to be updated (in case the emulated controller is not fetching the Doorbell value from the Shadow Doorbell buffer). You can check section 7.13 of the specification for an example of usage.

Results

The following test results were obtained in a machine of type n1-standard-4 (4 vCPUs, 15 GB memory) at Google Cloud Engine platform with Kernel 4.12.0-rc5 using the following command:

$ sudo fio --time_based --name=benchmark --runtime=30 \ --filename=/dev/nvme0n1 --nrfiles=1 --ioengine=libaio --iodepth=32 \ --direct=1 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=1 \ --rw=randread --blocksize=4k --randrepeat=0

Results (in Input/Ouput Operations per Second):
Without Shadow Doorbell and EventIdx buffers: 43.9K IOPS
With Shadow Doorbell and EventIdx buffers: 184K IOPS
Gain ~= 4 times

Screenshot - Without Shadow Doorbell and EventIdx buffers


Screenshot - With Shadow Doorbell and EventIdx buffers


Enjoy your enhanced numbers of IOPS! :D

 

Original post

Comments (0)


Add a Comment





Allowed tags: <b><i><br>Add a new comment:


Latest Blog Posts

Who knew we still had low-hanging fruit?

17/10/2017

Earlier this month I had the pleasure of attending the Web Engines Hackfest, hosted by Igalia at their offices in A Coruña, and also sponsored…

Performance analysis in Linux (continued)

06/10/2017

In this post, I will show one more example of how easy it is to disrupt performance of a modern CPU, and also run a quick discussion on…

XDC 2017 - Links to recorded presentations (videos)

23/09/2017

Many thanks to Google for recording all the XDC2017 talks. To make them easier to watch, here are direct links to each talk recorded at…

DebConf 17: Flatpak and Debian

17/08/2017

Last week, I attended DebConf 17 in Montréal, returning to DebConf for the first time in 10 years (last time was DebConf 7 in Edinburgh).…

Android: NXP i.MX6 on Etnaviv Update

24/07/2017

More progress is being made in the area of i.MX6, etnaviv and Android. Since the last post a lot work has gone into upstreaming and stabilizing…

vkmark: more than a Vulkan benchmark

18/07/2017

Ever since Vulkan was announced a few years ago, the idea of creating a Vulkan benchmarking tool in the spirit of glmark2 had been floating…

Open Since 2005

We use cookies on this website to ensure that you get the best experience. By continuing to use this website you are consenting to the use of these cookies. To find out more please follow this link.

Collabora Ltd © 2005-2017. All rights reserved. Website sitemap.