We're hiring!
*

Weston repaint scheduling

Pekka Paalanen avatar

Pekka Paalanen
February 11, 2015

Share this post:

Reading time:

Now that Presentation feedback has finally landed in Weston (feedback, flags), people are starting to pay attention to the output timings as now you can better measure them. I have seen a couple of complaints already that Weston has an extra frame of latency, and this is true. I also have a patch series to fix it that I am going to propose.

To explain how the patch series affects Weston's repaint loop, I made some JSON-timeline recordings before and after, and produced some graphs with Wesgr. Here I will explain how the repaint loop works timing-wise.

The old algorithm

The old repaint scheduling algorithm in Weston repaints immediately on receiving the pageflip completion event. This maximizes the time available for the compositor itself to repaint, but it also means that clients can never hit the very next vblank / pageflip.

Figure 1. The old algorithm, the client paints as response to frame callbacks.


Frame callback events are sent at the "post repaint" step. This gives clients almost a full frame's time to draw and send their content before the compositor goes to "begin repaint" again. In Figure 1. you see, that if a client paints extremely fast, the latency to screen is almost two frame periods. The frame latency can never be less than one frame period, because the compositor samples the surface contents (the "repaint flush" point) immediately after the previous vblank.

Figure 2. The old algorithm, the client paints as response to Presentation feedback events.


While frame callback driven clients still get to the full frame rate, the situation is worse if the client painting is driven by presentation_feedback.presented events. The intent is to draw and show a new frame as soon as the old frame was shown. Because Weston starts repaint immediately on the pageflip completion, which is essentially the same time when Presentation feedback is sent, the client cannot hit the repaint of this frame and gets postponed to the next. This is the same two frame latency as with frame callbacks, but here the framerate is halved because the client waits for the frame to be actually shown before continuing, as is evident in Figure 2.

Figure 3. The old algorithm, client posts a frame while the compositor is idle.


Figure 3. shows a less relevant case, where the compositor is idle while a client posts a new frame ("damage commit"). When the compositor is idle graphics-wise (the gray background in the figure), it is not repainting continuously according to the output scanout cycle. To start painting again, Weston waits for an extra vblank first, then repaints, and then the new frame is shown on the next vblank. This is also a 1-2 frame period latency, but it is unrelated to the other two cases, and is not changed by the patches.

The modification to the algorithm

The modification is simple, yet perhaps counter-intuitive at first. We reduce the latency by adding a delay. The "delay before repaint" is in all the figures, and the old algorithm is essentially using a zero delay. The compositor's repaint is delayed so that clients have a chance to post a new frame before the compositor samples the surface contents.

A good amount of delay is a hard question. Too small delay and clients do not have time to act. Too long delay and the compositor itself will be in danger of missing the vblank deadline. I do not know what a good amount is or how to derive it, so I just made it configurable. You can set the repaint window length in milliseconds in weston.ini. The repaint window is the time from starting repaint to the deadline, so the delay is the frame period minus the repaint window. If the repaint window is too long for a frame period, the algorithm will reduce to the old behaviour.

The new algorithm

The following figures are made with a 60 Hz refresh and a 7 millisecond repaint window.

Figure 4. The new algorithm, the client paints as response to frame callback.


When a client paints as response to the frame callback (Figure 4), it still has a whole frame period of time to paint and post the frame. The total latency to screen is a little shorter now, by the length of the delay before compositor's repaint. It is a slight improvement.

Figure 5. The new algorithm, the client paints as response to Presentation feedback.


A significant improvement can be seen in Figure 5. A client that uses the Presentation extension to wait for a frame to be actually shown before painting again is now able to reach the full output frame rate. It just needs to paint and post a new frame during the delay before compositor's repaint. This mode of operation provides the shortest possible latency to screen as the client is able to target the very next vblank. The latency is below one frame period if the deadlines are met.

Discussion

This is a relatively simple change that should reduce display latency, but analyzing how exactly it affects things is not trivial. That is why Wesgr was born.

This change does not really allow clients to wait some additional time before painting to reduce the latency even more, because nothing tells clients when the compositor will repaint exactly. The risk of missing an unknown deadline grows the later a client paints. Would knowing the deadline have practical applications? I'm not sure.

These figures also show the difference between the frame callback and Presentation feedback. When a client's repaint loop is driven by frame callbacks, it maximizes the time available for repainting, which reduces the possibility to miss the deadline. If a client drives its repaint loop by Presentation feedback events, it minimizes the display latency at the cost of increased risk of missing the deadline.

All the above ignores a few things. First, we assume that the time of display is the point of vblank which starts to scan out the new frame. Scanning out a frame actually takes most of the frame period, it's not instantaneous. Going deeper, updating the framebuffer during scanout period instead of vblank could allow reducing latency even more, but the matter becomes complicated and even somewhat subjective. I hear some people prefer tearing to reduce the latency further. Second, we also ignore any fencing issues that might come up in practise. If a client submits a GPU job that takes a long while, there is a good chance it will cause everything to miss a deadline or more.

As usual, this work and most of the development of JSON-timeline and Wesgr were sponsored by Collabora.

PS. Latency and timing issues are nothing new. Owen Taylor has several excellent posts on related subjects in his blog.

Original post

Related Posts

Related Posts

Comments (0)


Add a Comment






Allowed tags: <b><i><br>Add a new comment:


Search the newsroom

Latest Blog Posts

Automatic regression handling and reporting for the Linux Kernel

14/03/2024

In continuation with our series about Kernel Integration we'll go into more detail about how regression detection, processing, and tracking…

Almost a fully open-source boot chain for Rockchip's RK3588!

21/02/2024

Now included in our Debian images & available via our GitLab, you can build a complete, working BL31 (Boot Loader stage 3.1), and replace…

What's the latest with WirePlumber?

19/02/2024

Back in 2022, after a series of issues were found in its design, I made the call to rework some of WirePlumber's fundamentals in order to…

DRM-CI: A GitLab-CI pipeline for Linux kernel testing

08/02/2024

Continuing our Kernel Integration series, we're excited to introduce DRM-CI, a groundbreaking solution that enables developers to test their…

Persian Rug, Part 4 - The limitations of proxies

23/01/2024

This is the fourth and final part in a series on persian-rug, a Rust crate for interconnected objects. We've touched on the two big limitations:…

How to share code between Vulkan and Gallium

16/01/2024

One of the key high-level challenges of building Mesa drivers these days is figuring out how to best share code between a Vulkan driver…

Open Since 2005 logo

We use cookies on this website to ensure that you get the best experience. By continuing to use this website you are consenting to the use of these cookies. To find out more please follow this link.

Collabora Ltd © 2005-2024. All rights reserved. Privacy Notice. Sitemap.