May 07, 2020
My colleague Julian blogged about PipeWire earlier this year, mentioning that Collabora, as part of our work for Automotive Grade Linux, has been developing a PipeWire session manager called WirePlumber. In this post, I will attempt to explain a bit more about WirePlumber and give some context for future blog posts on this subject.
The main purpose of PipeWire is to act as an intermediate layer between applications and devices. To achieve this, it provides a generic way for applications to create media streams, which can then be directed to any device or other application for playback or capture. This functionality defines PipeWire as a stream exchange framework. Apart from providing a mechanism to create media streams, however, stream exchange also requires a mechanism to define who is exchanging data with whom. In other words, it needs a mechanism to decide which application is going to be connected to which device, how and when.
In traditional setups, applications have direct access to devices. This means they need to choose for themselves which device to open and set it up according to their media requirements (e.g. choose an audio sample rate, a format, a video resolution, etc). While system configuration can exist to define a “system default” device (e.g. in ALSA), in some setups this is not the case, which burdens the application developer with providing a way to configure device selection. Furthermore, such setups do not allow transparent switching of devices (e.g. switching audio playback from laptop speakers to a Bluetooth headset while music is playing), unless the application implements the complex operations required to do so. In some cases, another issue is that devices are controlled exclusively by a single application, which rules out more complex use cases where sharing a device is required. Last but not least, accessing devices directly increases the complexity of the applications’ media pipelines, since they have to handle multiple device formats or deal with misbehaving or non-standard devices.
PulseAudio has improved this situation significantly for audio applications. In PulseAudio, audio devices are opened and configured internally and audio applications can just create streams of any desired format and request to play or capture from the “default” device. Application developers no longer have to provide a means to configure which device to use, although they still can if they want to. PulseAudio maintains this “default” device preference internally and automatically creates the necessary internal links to make things work when a new stream comes in from an application. This default device preference can be changed at runtime and application streams can be transparently redirected to another device, abstracting away all complexity. The problem here, however, is that while this logic is great for most desktop applications, it does not scale well to other use cases. Also, PulseAudio does not handle video streams…
On the other side there is JACK, which deals with a specific use case as well: professional audio. JACK similarly allows applications to just create a stream and forget about the device. But unlike PulseAudio, it implements no connection logic internally. This is left to an external component: the session manager. The session manager watches for applications connecting or disconnecting and uses its own logic to link them to a device or a peer application. This may involve a “default” device target, but it normally follows a set of more complex user-configurable rules that allow flexibility in setting up the audio processing stage for professional audio applications. The problem here, however, is of course that JACK does not handle the typical desktop use case well and is complex to use for a non-professional.
Which brings us back to PipeWire… Combining parts of all these designs together, PipeWire provides a flexible media server that can be used to implement desktop, embedded, professional and non-professional use cases for both audio and video. To make this possible, PipeWire also relies on a session manager, similar to the one in JACK, but with even more capabilities available to it.
PipeWire upstream has a very limited example session manager. It serves as a good example for building new ones and has some functionality there for basic desktop use cases and testing, but it goes no further than that. WirePlumber serves as a replacement for this example and additionally provides a framework for building custom session managers.
The main goal of WirePlumber as a session manager is obviously to watch for streams from applications and make sure that they get linked to the appropriate device or peer application according to the rules of the use case that it implements. However, unlike a JACK session manager, a PipeWire session manager has more responsibilities.
PipeWire itself actually does not open any devices when it starts. It provides components that can do that, but they are not loaded by default in the daemon. A main task of the session manager is to load these components, for the devices that it is interested in, and configure the devices appropriately.
It is reasonable for this to be part of the session manager, since the decision of which devices to probe and how to configure them is specific to the use case. A car’s audio hardware requires different configuration than a desktop’s sound card.
WirePlumber provides a device monitoring module that works with all of PipeWire’s device monitor components that implement the spa_device interface. This includes the ALSA, V4L2 and bluez5 monitors. Additionally, it provides a module that loads the special “JACK” device, which allows PipeWire to run as a client to the JACK audio server.
PipeWire takes security seriously and assumes by default that all applications are untrustworthy. Internally, it provides a permissions system similar to the one on UNIX filesystems, which allows setting read, write & execute (rwx) bits on all objects that a client can access through its IPC protocol. A client that does not have the required permissions to access an object cannot do anything malicious with it.
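To make the idea of per-object rwx bits more concrete, here is a minimal sketch in Python. It is not PipeWire's API: the `Perm` flags, the `PermissionTable` class and the integer ids are all hypothetical names made up for this illustration, and the bit values simply mirror UNIX mode bits. What each bit means for a given PipeWire object is not covered here.

```python
from enum import IntFlag


class Perm(IntFlag):
    """Hypothetical rwx bits, numbered like UNIX mode bits."""
    R = 4
    W = 2
    X = 1


class PermissionTable:
    """Toy table mapping (client, object) pairs to granted permission bits."""

    def __init__(self):
        self._perms = {}  # (client_id, object_id) -> Perm

    def grant(self, client_id, object_id, perms):
        self._perms[(client_id, object_id)] = perms

    def allowed(self, client_id, object_id, needed):
        # Untrusted by default: a client with no entry has no bits set.
        granted = self._perms.get((client_id, object_id), Perm(0))
        return (granted & needed) == needed


table = PermissionTable()
table.grant(client_id=1, object_id=42, perms=Perm.R | Perm.X)
print(table.allowed(1, 42, Perm.R))  # client 1 may read object 42
print(table.allowed(1, 42, Perm.W))  # but not write to it
print(table.allowed(2, 42, Perm.R))  # client 2 was never granted anything
```

The point of the sketch is the default: a client that nobody has granted anything to gets an empty set of bits, which matches the "untrustworthy by default" stance described above.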
Another task of the session manager, therefore, is to authenticate clients and grant them the appropriate permissions. WirePlumber provides a module for that, although, at the time of writing this post, this module is only a placeholder and does not do proper permissions management; it just grants all clients full access to all objects. There are plans to implement this properly for AGL and for the desktop, though, so stay tuned.
PipeWire internally represents the media flow using a graph of components that are called “nodes”, which are linked to one another. These are the purple and green boxes in the diagram above. Nodes abstract processing logic and provide a way for getting data in and out of PipeWire, delegating processing to clients or devices.
When managing this graph, it is often the case that several nodes need to be managed together as a single entity that provides more complex functionality. For instance, an audio DSP filter that operates on an audio device would be represented by a node that is linked directly to that audio device’s node. Applications that want their audio to pass through that filter should then have their nodes linked with the filter node instead of the device node. This increases complexity of whichever component is making the decision on where to link what, as it now needs to have specific knowledge about this filter’s operation. Additionally, this does not work well with configuration UIs like pavucontrol or GNOME’s sound settings, which are built around the concept that applications connect directly to devices with nothing in-between.
Another concern is that in modern systems streams are often associated with a use case. This is not visible on desktop systems so much, but think of your phone. Audio streams that deliver music are separated from audio streams that deliver notifications or alarm sounds and they come with separate volume controls and policy as to whether they are audible, whether they are emphasized (all other streams muted or ducked to a lower volume), etc… Similar properties apply to video streams, where, for instance, a camera feed that is meant for live preview on your screen has a different encoding and resolution than the feed that is meant for video recording and the feed that is meant for still image (photo) capture.
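As a rough illustration of what "ducking" by use case means in pure software, consider the following sketch. Everything in it is an assumption made for the example: the role names, their priority order, the `DUCK_FACTOR` value and the `effective_volumes` function are hypothetical and do not come from PipeWire or WirePlumber.

```python
# Hypothetical policy: higher number means more important use case.
PRIORITY = {"music": 0, "notification": 1, "alarm": 2}
DUCK_FACTOR = 0.3  # assumed attenuation applied to less important streams


def effective_volumes(streams):
    """Return {stream name: volume} after ducking lower-priority streams.

    `streams` is a list of (name, role, volume) tuples for the streams
    that are currently playing.
    """
    if not streams:
        return {}
    top = max(PRIORITY[role] for _, role, _ in streams)
    return {
        name: volume if PRIORITY[role] == top else volume * DUCK_FACTOR
        for name, role, volume in streams
    }


playing = [("song", "music", 1.0), ("ding", "notification", 0.8)]
print(effective_volumes(playing))
```

While the notification plays, the music stream is attenuated and the notification keeps its own volume. Once the notification stream disappears from the list, the music stream is the most important one again and is no longer ducked.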
While it may not sound complex, associating streams with use cases can be very much so on embedded systems. In pure software, for example, implementing the audio use cases would be just a matter of categorizing application streams and adjusting their volume controls or their link status based on policy configuration. On embedded systems, however, it is common for all of this to be implemented on a dedicated hardware DSP that receives all the streams via different paths and applies all the mixing, volume alterations, effects and policy in hardware. Controlling the operation of this hardware, therefore, becomes specific to the device, which means that the session manager, on the CPU side, needs to present an abstraction layer so that policy configuration works similarly on different devices.
All these problems are solved in WirePlumber by implementing certain objects that are called endpoints. Endpoints, just like nodes, are also linked to one another forming a graph. Each one of them represents a user-conceivable place where media can be routed to/from (such as a pair of speakers or a bluetooth headset’s microphone) and provides a set of endpoint streams, which represent logical paths that can be taken to reach this place, often associated with a use case.
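The following sketch models the concept in a few lines of Python. It is only a toy data model under my own assumptions, not WirePlumber's actual objects: the `Endpoint` class, its `direction` field and the `link_endpoints` function are hypothetical names chosen for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """A user-conceivable place where media can be routed to or from."""
    name: str
    direction: str  # assumed values: "sink" consumes media, "source" produces it
    streams: list = field(default_factory=list)  # logical paths, e.g. use cases
    nodes: list = field(default_factory=list)    # node-graph details hidden behind it


def link_endpoints(source, sink, stream):
    """Link two endpoints through one of the sink's endpoint streams."""
    if source.direction != "source" or sink.direction != "sink":
        raise ValueError("links go from a source endpoint to a sink endpoint")
    if stream not in sink.streams:
        raise ValueError(f"{sink.name} has no endpoint stream named {stream!r}")
    return (source.name, sink.name, stream)


speakers = Endpoint(
    name="speakers",
    direction="sink",
    streams=["music", "notification"],
    nodes=["dsp-filter", "alsa-output"],  # the filter stays invisible to policy
)
player = Endpoint(name="player", direction="source", streams=["music"])

print(link_endpoints(player, speakers, "music"))
```

Notice that whoever calls `link_endpoints` never has to know that a DSP filter node sits in front of the device node. That detail is owned by the endpoint, which is exactly what keeps the policy simple.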
The purpose of this endpoints graph (also called the “session management graph” in the documentation) is to provide a means of viewing the nodes graph from a higher-level perspective that involves use cases and targets that the user can understand. This makes policy and other configuration easier to write, allowing the user to forget about device-specific details and focus on the actual user experience that this configuration will deliver.
WirePlumber constructs all endpoints using a module that is driven by user-configurable rules and has a modular system for loading system-specific endpoint providers. That system allows integrators to provide code that manages specific hardware, without having to re-implement a custom session manager from scratch.
Last but not least, WirePlumber provides a module that creates links between endpoints based on user-configurable policy rules. This is its main goal as a session manager. Unfortunately, the current way of configuring policy is not as flexible as we would like it to be, despite it being the second attempt at writing a policy management module. In the very near future, my plan is to experiment with Lua-based scripts that will describe this policy. This subject will be discussed further in a future blog post, so I will keep it short here.
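To give a feeling of what rule-driven linking can look like, here is a small first-match-wins sketch. It does not reflect WirePlumber's actual configuration format: the rule layout, the property keys (`role`, `app`), the endpoint names and the `pick_target` function are all hypothetical.

```python
import fnmatch

# Hypothetical policy rules, evaluated top to bottom; the first match wins.
RULES = [
    {"match": {"role": "navigation"}, "target": "cabin-speakers"},
    {"match": {"app": "music-*"}, "target": "headphones"},
    {"match": {}, "target": "default-sink"},  # empty match: catch-all
]


def pick_target(stream_props, rules=RULES):
    """Return the target endpoint name for a stream, or None if no rule matches."""
    for rule in rules:
        matches = all(
            fnmatch.fnmatchcase(stream_props.get(key, ""), pattern)
            for key, pattern in rule["match"].items()
        )
        if matches:
            return rule["target"]
    return None


print(pick_target({"role": "navigation"}))
print(pick_target({"app": "music-player"}))
print(pick_target({"app": "browser"}))
```

A static table like this is easy to read, but it also shows the kind of limit mentioned above: as soon as a decision depends on state (what else is playing, what was just plugged in), plain pattern matching runs out of expressiveness.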
As you may have noticed, in all the above text about WirePlumber’s features I have mentioned that it provides “modules” that offer functionality. This is a key design aspect of WirePlumber. Every function is a module that builds upon a shared library with common functionality and interfaces that allow the modules to work together.
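The modules in WirePlumber are written in C on top of its common library, but the general pattern can be sketched in a language-neutral way. In the following Python sketch, `Core` stands in for the shared library and the two `load_*` functions stand in for modules. All of these names are hypothetical and only illustrate the pattern of modules cooperating through shared interfaces.

```python
class Core:
    """Stand-in for the shared library: a registry of named interfaces."""

    def __init__(self):
        self._implementations = {}

    def register(self, interface, implementation):
        # A module loaded later may replace what an earlier module provided.
        self._implementations[interface] = implementation

    def lookup(self, interface):
        return self._implementations[interface]


def load_default_policy(core):
    """Hypothetical stock module: everything goes to one endpoint."""
    core.register("policy", lambda role: "default-endpoint")


def load_car_policy(core):
    """Hypothetical integrator module that overrides the stock policy."""
    def policy(role):
        return "cabin-speakers" if role == "navigation" else "default-endpoint"
    core.register("policy", policy)


core = Core()
load_default_policy(core)
print(core.lookup("policy")("navigation"))

load_car_policy(core)  # swap in custom behaviour without touching other modules
print(core.lookup("policy")("navigation"))
```

Other modules only ever ask the core for the "policy" interface, so they keep working unchanged no matter which module provided it.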
WirePlumber’s common library is based on GObject, which, among other things, allows implementing bindings to other languages easily. While current modules are all written in C, mechanisms exist to allow implementing them in different languages.
The idea behind all this is for WirePlumber to serve as a whole framework for building custom session managers for PipeWire. It is possible this way to replace functionality that already exists in some module or complement it with additional code. Combined with the modular and extensible nature of PipeWire itself, this can be a very powerful tool for adding custom functionality that goes beyond PipeWire’s original targets.