Asymmetric Multi Processing with Linux & Zephyr on the STM32MP1

Arnaud Ferraris
March 03, 2021

In the embedded world, more and more vendors offer Arm-based Systems-on-Chip (SoCs) that combine powerful Cortex-A CPU cores, designed to run a full-featured OS such as Linux, with one or more low-power Cortex-M cores, usually found in microcontrollers and designed to execute bare-metal or RTOS-based applications. In these designs, Linux generally runs on the main processor, with full networking, memory management and security capabilities, while the coprocessor runs specialized software that depends entirely on your system requirements:

  • real-time sensor processing
  • system management, keeping your device alive while Linux is in standby
  • system monitoring, able to reset or recover the whole system in case of failure

In this article, we'll take a closer look at the STMicroelectronics STM32MP1 microprocessor series, which contains two Cortex-A7 cores clocked at 650-800MHz and a single Cortex-M4 clocked at 209MHz. Both processors have access to all the SoC peripherals; memory is the exception: while the main processor uses external DDR RAM, the coprocessor can only use a dedicated 448KB SRAM. To allow communication between the two processors, this SoC also includes a mailbox and a hardware semaphore, which can be used by all cores to exchange information and ensure exclusive access to peripherals.

STMicroelectronics STM32MP1 discovery kit board

Zephyr, the other OS from the Linux Foundation

While the Linux kernel can run on a wide range of devices, it requires a decent amount of memory (> 4MB), and therefore cannot be used on memory-constrained microcontrollers.

Enter Zephyr, a project initiated by Wind River and now developed as a Linux Foundation project.

Zephyr is fully configurable and supports a large number of boards. It provides a lot of sample programs and includes several useful libraries.

Zephyr is also easily extensible, as third-party modules and libraries (such as zscilib) can be added to the project with little configuration required.

Download Zephyr sources

On a Debian-based system (Debian buster or later), installing Zephyr is relatively straightforward. At the time of this writing, you can download Zephyr's build dependencies and source code by following these steps:

  • first install the base dependencies:
    sudo apt install --no-install-recommends git cmake ninja-build gperf \
      ccache dfu-util device-tree-compiler wget \
      python3-dev python3-pip python3-setuptools python3-tk python3-wheel xz-utils file \
      make gcc gcc-multilib g++-multilib libsdl2-dev
  • then install an ARMv7 toolchain:
    sudo apt install gcc-arm-none-eabi
  • then install west (Zephyr's meta-tool, used for both repository management -- similar to repo -- and as a build command):
    pip3 install --user -U west
    echo 'export PATH=~/.local/bin:"$PATH"' >> ~/.bashrc
    source ~/.bashrc
  • finally, pull the Zephyr source code:
    west init /directory/to/which/install/zephyr
    cd /directory/to/which/install/zephyr
    west update

Detailed instructions can be found in the Zephyr getting started guide.

IMPORTANT: make sure you use a recent version of Zephyr (>= v2.4.0) as older versions won't work with a mainline Linux kernel on STM32MP1.

Anatomy of a Zephyr program

Zephyr applications are much like any C program, with the main() function being the entry point. However, since microcontroller applications aren't supposed to exit, main() usually consists of initialization code followed by an infinite loop where all the work is done.

When looking at the blinky sources, we see that it's all quite straightforward:

  • the CMakeLists.txt file is used for compilation and specifies which application-specific files should be compiled in
  • prj.conf acts as an application-specific defconfig: each supported board has its own defconfig (much like what you'll find in the Linux kernel); config options required for the application to compile and run are added to prj.conf, which is internally appended to the board's defconfig when building the project
  • sample.yaml is used to define test cases for Zephyr's test runner.

The source code of the application itself is in the src/main.c file.

The code itself is pretty much self-explanatory, although one interesting point is how device trees are used: unlike a Linux kernel, a Zephyr binary will run on one platform only, so the device tree is parsed and interpreted at compile time, which avoids unnecessary usage of the target's limited resources.

Macros (DT_*) are used to fetch specific device properties, and functions such as device_get_binding() are used to retrieve the compiled-in device information.

In blinky's case, we look for the led0 device-tree alias, which should point to the LED to be used, as you will see in the STM32MP157C-DK2 device-tree: led0 is an alias for red_led_1, which is an LED connected to pin 7 of GPIO port H, active high.

Build a sample for STM32MP157C-DK2

From the Zephyr directory, clone the sample application:

git clone https://gitlab.collabora.com/aferraris/zephyr-rpmsg-demo.git

This demo application showcases how sensor processing (here, an MPU6050 6-axis gyroscope + accelerometer) can be offloaded to the coprocessor while still making the data available to the Linux system. It runs 2 threads:

  • the first one (defined in imu.c) is dedicated to the sensor: this task calibrates the MPU6050, then fetches and processes measurements from the sensor every 10 milliseconds
  • the other thread (ipc.c) is used for communicating with the Linux system, using the OpenAMP library; this one is heavily based on the openamp_rsc_table Zephyr sample

Finally, the project also includes a boards folder containing a device-tree overlay for the STM32MP157C-DK2, defining the following elements:

  • aliases for the mailbox and shared SRAM area used for inter-processor communication
  • I2C subnode for the sensor

The communication is a simple ASCII-encoded request-response protocol where all communication is initiated on the Linux side:

  • fetch requests the current sensor orientation from the coprocessor; the reply is in the form X=<x_orientation>;Y=<y_orientation>;Z=<z_orientation> (values in degrees, the reference being the initial position)
  • quit tells the coprocessor to end the IPC task; while not very useful for this application, it demonstrates the ability to stop a thread while keeping the other tasks running
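
On the Linux side, a reply in the format described above is easy to split with standard tools. The following is a minimal sketch: parse_orientation is a hypothetical helper that is not part of the demo sources, and the numeric values shown in the usage line are purely illustrative.

```shell
#!/bin/sh
# parse_orientation REPLY
# Hypothetical helper: split a reply such as "X=1.5;Y=-3.0;Z=90.2"
# into one "axis value" pair per line.
parse_orientation() {
    reply=$1
    # Reject anything that doesn't contain the three expected fields, in order.
    case $reply in
        X=*\;Y=*\;Z=*) ;;
        *) echo "unexpected reply: $reply" >&2; return 1 ;;
    esac
    # Turn the ';' separators into newlines, then each '=' into a space.
    printf '%s\n' "$reply" | tr ';' '\n' | tr '=' ' '
}

# Usage (illustrative values):
#   parse_orientation 'X=1.5;Y=-3.0;Z=90.2'
```

Each output line can then be consumed with a plain `while read axis value` loop.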

In order to build the application, set up the build environment first, so that Zephyr knows its path and which toolchain to use:

. zephyr/zephyr-env.sh
export ZEPHYR_TOOLCHAIN_VARIANT=cross-compile
export CROSS_COMPILE=/usr/bin/arm-none-eabi-
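
Since CROSS_COMPILE is only a prefix, a typo in it tends to show up late in the build. As an optional sanity check, the sketch below verifies that the compiler the prefix points to actually exists; check_toolchain is a hypothetical helper name, not something provided by Zephyr or west.

```shell
#!/bin/sh
# check_toolchain [PREFIX]
# Hypothetical helper: confirm that "<PREFIX>gcc" is an executable command.
# PREFIX defaults to the CROSS_COMPILE environment variable.
check_toolchain() {
    prefix=${1:-${CROSS_COMPILE:-}}
    if [ -z "$prefix" ]; then
        echo "CROSS_COMPILE is not set" >&2
        return 1
    fi
    if ! command -v "${prefix}gcc" >/dev/null 2>&1; then
        echo "compiler not found: ${prefix}gcc" >&2
        return 1
    fi
    echo "using ${prefix}gcc"
}

# Usage, after exporting CROSS_COMPILE as shown above:
#   check_toolchain
```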

Then build the application:

west build --board stm32mp157c_dk2 zephyr-rpmsg-demo

This will produce a file named build/zephyr/zephyr-rpmsg-demo.elf, which is the executable file to be loaded on the coprocessor.

Connect the MPU6050

The MPU6050 is connected to port I2C5 of the STM32MP157C, which is available on the 40-pin GPIO connector CN2:

  • 3V3 is pin 1
  • SDA is pin 3
  • SCL is pin 5
  • GND is pin 6
  • pin 7 (GPIO4) could be wired to the interrupt pin of the MPU6050 (unused in this application)

You could also use the Arduino headers on the other side of the board:

  • SDA is pin D14 (connector CN13)
  • SCL is pin D15 (connector CN13)
  • GND and 3V3 are both available on connector CN16

Run the firmware

The Zephyr application could easily be loaded and started when booting the system using U-Boot, following the instructions in U-Boot's documentation (look for the Coprocessor firmware paragraph). This can be useful when, for example, you want the coprocessor to monitor the Linux system.

However, by doing this we would give up control of the coprocessor, whereas it can be useful to fully manage it from the Linux system running on the A7 cores.

The remoteproc framework and RPMsg

remoteproc is a Linux kernel framework aimed at providing the basic infrastructure for loading firmware to a remote processor, as well as powering it on and off.

rpmsg is a communication framework, based on virtio, that lets Linux drivers communicate with a remote processor. It abstracts the low-level implementation so that clients can focus on the message payloads.

remoteproc relies on device-specific drivers and creates the virtio devices necessary for rpmsg to work. It also provides a sysfs interface for controlling the coprocessor and mostly works out of the box once properly configured.

rpmsg, however, requires writing a client kernel driver in order to implement the chosen communication protocol and expose high-level controls (such as a /dev node or a sysfs interface) to either other drivers or userspace programs.

Configuring the kernel

Using a mainline Linux kernel (we recommend using version 5.9 or above), the following options have to be set in order to enable remoteproc and rpmsg support:


On the STM32MP1 processor series, you will also need the following device driver:


Note: these options are already enabled when using the mainline stm32mp15_defconfig.
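
To verify a given kernel configuration, the options can be looked up in its config file. The sketch below is a generic checker that takes the option names as arguments; check_kconfig is a hypothetical helper. The symbols in the usage line (CONFIG_REMOTEPROC, CONFIG_RPMSG_VIRTIO, CONFIG_STM32_RPROC) do exist in mainline Kconfig, but treating them as the complete set needed here is an assumption. Reading /proc/config.gz also assumes the kernel was built with CONFIG_IKCONFIG_PROC.

```shell
#!/bin/sh
# check_kconfig CONFIG_FILE OPTION...
# Hypothetical helper: report whether each option is enabled (=y or =m)
# in the given kernel config file. Returns non-zero if any is missing.
check_kconfig() {
    config=$1
    shift
    missing=0
    for opt in "$@"; do
        if grep -Eq "^${opt}=(y|m)\$" "$config"; then
            echo "$opt: enabled"
        else
            echo "$opt: MISSING"
            missing=1
        fi
    done
    return "$missing"
}

# Usage on the target (option list is an assumption, adjust as needed):
#   zcat /proc/config.gz > /tmp/config
#   check_kconfig /tmp/config CONFIG_REMOTEPROC CONFIG_RPMSG_VIRTIO CONFIG_STM32_RPROC
```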

Building the RPMsg client module

The RPMsg client for this demo application is available in the zephyr-rpmsg-client repository on Collabora's GitLab.

It can be built on the target device by executing the following commands:

$ git clone https://gitlab.collabora.com/aferraris/zephyr-rpmsg-client.git
$ cd zephyr-rpmsg-client
$ make -C /lib/modules/`uname -r`/build M=$PWD
$ sudo make -C /lib/modules/`uname -r`/build M=$PWD modules_install

If cross-compiling from a workstation, simply replace /lib/modules/`uname -r`/build with the kernel build directory and add the proper ARCH= and CROSS_COMPILE= directives to the build command-line.
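
As a rough sketch of what that cross-compilation could look like, the wrapper below assembles the make invocation. The function name, the KERNEL_CROSS_COMPILE variable and the arm-linux-gnueabihf- default prefix (the one used by Debian's cross toolchain packages) are assumptions for illustration. It deliberately does not reuse the CROSS_COMPILE variable exported for the Zephyr build, since that one points at the bare-metal compiler.

```shell
#!/bin/sh
# cross_build_module KERNEL_BUILD_DIR [MAKE_ARGS...]
# Hypothetical wrapper around the out-of-tree module build shown above.
# MAKE can be overridden (e.g. MAKE=echo) to preview the command line.
cross_build_module() {
    kdir=$1
    shift
    ${MAKE:-make} -C "$kdir" M="$PWD" ARCH=arm \
        CROSS_COMPILE="${KERNEL_CROSS_COMPILE:-arm-linux-gnueabihf-}" "$@"
}

# Usage, from the zephyr-rpmsg-client directory (paths are placeholders):
#   cross_build_module /path/to/kernel/build modules
#   cross_build_module /path/to/kernel/build INSTALL_MOD_PATH=/path/to/rootfs modules_install
```

INSTALL_MOD_PATH is the standard kbuild variable for installing modules into a staging root filesystem rather than the workstation's own /lib/modules.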

Running the firmware

Once the kernel is properly configured and the system is running, you must first copy the coprocessor binary file (in our case, the zephyr-rpmsg-demo.elf file) to the /lib/firmware directory on the target board.

By default, on STM32MP1, Linux expects the firmware to be named rproc-m4-fw. You can therefore either rename your firmware file to match the default name, or instruct the kernel to load the zephyr-rpmsg-demo.elf file by executing the following command:

# echo zephyr-rpmsg-demo.elf > /sys/class/remoteproc/remoteproc0/firmware

The coprocessor can then be controlled by writing to /sys/class/remoteproc/remoteproc0/state:

  • load the firmware and start the coprocessor:
    # echo start > /sys/class/remoteproc/remoteproc0/state
  • stop the coprocessor:
    # echo stop > /sys/class/remoteproc/remoteproc0/state
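
The copy, firmware selection and start steps can be gathered in a small script. This is only a sketch: start_firmware is a hypothetical helper, it needs to run as root on the target, and it assumes the coprocessor is exposed as remoteproc0 as in the commands above. It stops a running coprocessor first, because the remoteproc sysfs interface refuses a firmware change while the state is "running".

```shell
#!/bin/sh
# start_firmware ELF [RPROC_DIR] [FIRMWARE_DIR]
# Hypothetical helper: install an ELF file as coprocessor firmware and start it.
start_firmware() {
    elf=$1
    rproc=${2:-/sys/class/remoteproc/remoteproc0}
    fwdir=${3:-/lib/firmware}
    name=$(basename "$elf")

    # The kernel looks firmware up by name in the firmware directory.
    cp "$elf" "$fwdir/$name" || return 1

    # Stop the coprocessor if it is already running another firmware.
    if [ "$(cat "$rproc/state")" = "running" ]; then
        echo stop > "$rproc/state" || return 1
    fi

    echo "$name" > "$rproc/firmware" || return 1
    echo start > "$rproc/state"
}

# Usage (as root, on the target):
#   start_firmware ./zephyr-rpmsg-demo.elf
```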

You can check the current sensor orientation by reading the /sys/bus/rpmsg/drivers/rpmsg_zephyr_client/values file.
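
To watch the orientation change while moving the sensor, the file can simply be read in a loop. The sketch below prints its content a fixed number of times; poll_orientation is a hypothetical helper, and the default count and delay are arbitrary choices. It makes no assumption about the exact format the driver exposes and prints each reading as-is.

```shell
#!/bin/sh
# poll_orientation [VALUES_FILE] [COUNT] [DELAY_SECONDS]
# Hypothetical helper: print the content of the values file COUNT times,
# waiting DELAY_SECONDS between two consecutive reads.
poll_orientation() {
    values=${1:-/sys/bus/rpmsg/drivers/rpmsg_zephyr_client/values}
    count=${2:-5}
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$count" ]; do
        if ! line=$(cat "$values" 2>/dev/null); then
            echo "cannot read $values (is the firmware running?)" >&2
            return 1
        fi
        printf '%s\n' "$line"
        i=$((i + 1))
        # Don't sleep after the last reading.
        if [ "$i" -lt "$count" ]; then
            sleep "$delay"
        fi
    done
    return 0
}

# Usage: ten readings, one per second
#   poll_orientation /sys/bus/rpmsg/drivers/rpmsg_zephyr_client/values 10 1
```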


Many modern SoCs such as the STM32MP1 now include coprocessor cores which can be used for a wide range of tasks and can offload some of the work from the main processor. Using Zephyr alongside Linux can be a simple and efficient way to take advantage of these additional cores and opens a new world of possibilities.

If you have questions or need assistance regarding asymmetric multi-core processing with Linux and Zephyr on the STM32MP1 or any other Arm-based SoC, please get in touch!

Comments (2)

  1. Rasit Eskicioglu:
    Mar 31, 2021 at 09:51 PM

    Hi Arnoud,

    It seems that STM32MP157C-DK2 obsolete on Digi Key. Is there a new DK that this work is ported on?


    1. Arnaud Ferraris:
      May 31, 2021 at 01:38 PM

      Hi Rasit,

      This hasn't been tested with other STM32MP1 development boards, but I believe it could work as-is, or with maybe minimal changes, on both the STM32MP157D-DK1 & STM32MP157F-DK2.

