
Crosswalk and its JS, JAVA and Native Extensions performance

Danilo Cesar
February 05, 2015


I had a lot of fun recently playing with Crosswalk. I was basically testing its features and extensions when it hit me: I know it’s possible to write native extensions for Crosswalk Tizen, but is it possible to write native extensions for Crosswalk Android? How would it behave?

Crosswalk Android already provides a way to load Java extensions, and the Android NDK provides a way to call a native library from a Java method. We just have to glue those things together.
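
To make the glue concrete, here is a minimal sketch of what the Java side could look like, following the XWalkExtensionClient pattern from the Crosswalk Android extension tutorials. The package, class and method names, and the plain-string message format, are illustrative assumptions rather than code from a real extension:

package com.example.extension;

import org.xwalk.app.runtime.extension.XWalkExtensionClient;
import org.xwalk.app.runtime.extension.XWalkExtensionContextClient;

public class NativeBridgeExtension extends XWalkExtensionClient {
    static {
        // Loads libMyLibrary.so, the same library we will copy into the apk below.
        System.loadLibrary("MyLibrary");
    }

    // Implemented in C; the NDK build must export the matching
    // Java_com_example_extension_NativeBridgeExtension_nativeCompute symbol.
    private native int nativeCompute(int n);

    public NativeBridgeExtension(String name, String jsApiContent,
                                 XWalkExtensionContextClient context) {
        super(name, jsApiContent, context);
    }

    @Override
    public void onMessage(int instanceId, String message) {
        // Assumed message format: the JS side sends a number as a plain string,
        // and we post the result back the same way.
        int n = Integer.parseInt(message);
        postMessage(instanceId, String.valueOf(nativeCompute(n)));
    }
}

The Crosswalk-specific part is only the message plumbing; the static System.loadLibrary() block and the native declaration are plain JNI, which is what lets us reuse an NDK-built library.so.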

Disclaimer:
I won’t talk here about how to build a library with the NDK or how to write a regular extension with Crosswalk, as those topics are already extensively covered on the Internet. I assume you already have a library.so and a working Crosswalk Android extension from the tutorials.

Adding the .so inside your apk

Crosswalk-samples includes an example application named “extensions-android”. This application first builds the extension code using an “ant” recipe, then builds the HTML application itself and puts both inside an APK file.

At the end of the extension build process, the ant recipe opens the extension’s .jar file to add the META-INF information to it. That’s the perfect time to inject our native library into the package.

We only have to change the build.xml recipe to include the library.

<copy file="libs/armeabi/libMyLibrary.so" todir="${build}/lib/armeabi-v7a/" />
Be sure to place it between the <unjar> and <jar> steps, as we’re taking advantage of the fact that the extension jar is already being opened and closed to include the META-INF.

Performance measurement

For the performance test I used an algorithm that finds the nth prime number. The approach is the simplest (no pre-built arrays, no recursion) and dumbest (no shortcuts, no AKS, etc.) one, and all three languages share the same implementation. The JavaScript version is the following:

 


function findPrime(n)
{
    // Count "primes" from 1 upwards until we reach the nth one, then return it.
    var i, count = 0;
    for (i = 1; count < n; i++) {
        if (isPrime(i))
            count++;

        if (count == n)
            break;
    }
    return i;
}

function isPrime(n)
{
    // Naive trial division; note that 1 is counted as prime here, but all
    // three implementations share the same behaviour.
    var i = 0;
    if (n == 1 || n == 2)
        return true;

    for (i = 2; i < n; i++) {
        if (n % i == 0) {
            break;
        }
    }

    return (n == i);
}

Converting it to Java or C would be ridiculously easy.
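
For reference, here is what a straight Java port of the functions above could look like (the C version is essentially the same code with C syntax). It is a sketch that mirrors the JavaScript line by line, not necessarily the exact code behind the measurements:

class NaivePrimes {
    // Mirrors the JavaScript above, including the quirk that 1 is counted as prime.
    static boolean isPrime(int n) {
        if (n == 1 || n == 2)
            return true;

        int i;
        for (i = 2; i < n; i++) {
            if (n % i == 0)
                break;
        }
        return n == i;
    }

    static int findPrime(int n) {
        int count = 0;
        int i;
        for (i = 1; count < n; i++) {
            if (isPrime(i))
                count++;
            if (count == n)
                break;
        }
        return i;
    }
}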

 

Performance Results

I first tried to find the 10th prime number and the result is the following:

[Chart: time to find the 10th prime number in JS, Java and C]

This was quite expected, right? I mean, the “context switch” between JS->Java or JS->Java->C has a price to be paid: there’s a serialization/deserialization step that takes some time to finish, and since the core operation is not really time consuming (a short iteration between 1 and 29, checking if they’re prime numbers), I can imagine why Java and C took more time. Let’s try again and find the 100th prime:

[Chart: time to find the 100th prime number in JS, Java and C]

OK, it’s a bit odd: the loading time differs from execution to execution, and it affects the first time the code is executed. Ignoring the highest (usually the first execution) and lowest values (got lucky?) gives us a saner result, but still… What about the 10,000th prime number?

[Chart: time to find the 10,000th prime number in JS, Java and C]

It doesn’t make any sense, JS is faster?! What’s going on?

Division and modulus are expensive. Even when I build the native library with armeabi-v7a it only gets about 30% better and goes below 10 seconds, almost the same performance achieved by Java but still worse than JavaScript.

It turns out that the modulus operation can be optimized, and I believe V8 is smart enough to use some of those techniques. So looking for primes might not be a fair way to measure performance.

Performance Results, round two

A second approach would be to avoid divisions and use only plain additions. Maybe just sum the sequence 1+2+…+N, repeated N times?
Maybe 1-2+3-4…±N to avoid overflow?

 


function sum(n) {
    // Computes the alternating sum 1 - 2 + 3 - 4 ... ±n, seeded with n/2,
    // and repeats it n+1 times.
    var signal;
    var i, j;
    var total = 0;
    var r1;
    var start = n / 2;

    for (j = 0; j <= n; j++) {
        r1 = start;
        signal = -1;
        for (i = 0; i <= n; i++) {
            r1 = r1 + (i * signal);
            signal = -signal;
        }
        total = total + r1;
    }
    return total;
}
As the final value is -n/2, I’m starting the sum with n/2 to get 0 as the result.
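
As before, the Java port is a near-mechanical translation. The sketch below uses double for the accumulator, on the assumption that this keeps the arithmetic comparable to JavaScript’s double-based numbers; the C version would follow the same shape:

class AlternatingSum {
    // Repeats the alternating sum 1 - 2 + 3 - 4 ... ±n, n+1 times,
    // seeding each round with n/2 so the expected result is 0.
    static double sum(int n) {
        double total = 0;
        double start = n / 2.0;

        for (int j = 0; j <= n; j++) {
            double r1 = start;
            int signal = -1;
            for (int i = 0; i <= n; i++) {
                r1 = r1 + (i * signal);
                signal = -signal;
            }
            total = total + r1;
        }
        return total;
    }
}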

 

For a small sum of ten elements:
[Chart: alternating-sum benchmark, n = 10]

Makes sense, right? The context switch costs more than doing the sum of the 10 elements directly in JavaScript. Let’s see the 50K results:

[Chart: alternating-sum benchmark, n = 50,000]
As you can see here, without room for smart optimizations on the JS side, the native results are 3x better than the JavaScript ones. Nice!

Conclusion

There are a few lessons we learned here:

1 – The cost of going from JS->Java or JS->Java->C cannot be ignored. It was a bit less than 1 ms (from the 10-element sum chart). Think about that before putting a native call inside a huge loop.
2 – Measure your optimizations before accepting them into your source code. Sometimes things are not obvious.

Links

Original post


Comments (2)

  1. Salvatore Iovene:
    Jun 08, 2015 at 01:52 PM

    Hi (ciao?),
    I did as you suggest here, i.e. added the .so files from a third-party library to /lib/armeabi-v7a/ in the APK, but when I reference that library the extension fails to load. Anything else missing? Thanks!


  2. Danilo Cesar Lemes de Paula:
    Jun 10, 2015 at 02:04 PM

    Hey there.

    Can you see anything useful in adb logcat? It usually says when an expected library fails to load.
    It might be the case that your library is not expected to be under /lib/armeabi-v7a/; it might be somewhere else.

    I got that path from adb logcat, but I tested it with an armeabi-v7a device, so it might be a different path for your device.


