Alex's work experience at Collabora

Alex Shtyrov
August 06, 2015

My work experience placement at Collabora was interesting, to say the least. As a 14-year-old work experience student, I was exposed to many new ideas and concepts, and the placement helped me assess the realities of software engineering, both good and bad (though mostly good).

During my stay, I worked on testing Evolution Data Server (EDS), an open source project which provides a system for unifying contacts, and is a back end for the Evolution application on GNOME. I focused mainly on the vCard parsing capabilities of EDS. vCard is a common electronic business card format, consisting of a set of key-value pairs. The keys (known as attributes) can also be accompanied by parameters. Admittedly, the project posed a challenge to me, as I had never before worked with open source software, or, in fact, any large software project.
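
To make the format concrete, here is a minimal Python sketch that splits one simplified vCard line into its attribute name, parameters and value. The function name `parse_content_line` and the sample line are made up for illustration, and the sketch deliberately ignores quoted parameter values, line folding and escaping, so it is nowhere near a full parser.

```python
def parse_content_line(line):
    """Split one simplified vCard line into (name, params, value).

    Assumes no quoted parameter values, no folded lines and no escaping.
    """
    # Everything before the first colon is the attribute name plus parameters.
    head, _, value = line.partition(":")
    name, *raw_params = head.split(";")

    params = {}
    for raw in raw_params:
        key, _, vals = raw.partition("=")
        # A parameter may carry a comma-separated list of values.
        params[key.upper()] = vals.split(",") if vals else []
    return name.upper(), params, value


print(parse_content_line("TEL;TYPE=work,voice:+1-555-0100"))
# → ('TEL', {'TYPE': ['work', 'voice']}, '+1-555-0100')
```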

Day 1 went fairly smoothly, as far as first days go - induction talk, meeting the staff, learning to use the coffee machine, safety talk, being introduced to the project. Yes, there were issues with my e-mail account, and, naturally, JHBuild, which I was using to build EDS, failed to install the correct dependencies. This meant 2 painful hours of manually installing the countless dependencies of EDS. But I was learning, and that was the main thing.

With the set-up stage complete, Day 2 was spent planning, which was mostly an exercise in specification reading (RFC 6350, the vCard specification, dominated). Then, I wrote a set of acceptance criteria and split the project up into tasks, which were uploaded to Phabricator. All of this was completely alien to me; however, with the help of my co-workers, the documents were swiftly completed.

Development started in earnest on Day 3. The plan was to split the project into 2 parts - a set of vCards, both valid and invalid, and a small program which would pass the vCards to EDS. The program, which was written in C, was fairly simple: each vCard was read into a string using the GLib g_file_get_contents() function, and this string was then passed to the EDS e_vcard_new_from_string() function, which returns an EVCard object representing the vCard. Initially, we expected the parser to simply return NULL or an empty EVCard object if it was passed an invalid vCard, but it turned out that the parser is, in fact, very fault-tolerant and always tries to return something. This made the task somewhat harder, so I was told to concentrate on trying to make EDS crash, hang, or exhibit weird write activity which could potentially be exploited.

Since my project was an educational one, it made sense to try out a range of testing techniques, all of which were aimed at generating input which would make EDS crash, hang or perform an invalid write operation. My first effort was a simple fuzzer script, written in Python. This script, when passed a file, would select a random section covering 10% of the file and replace it with random characters. Unfortunately, this script failed to generate any useful results (apart from a few memory leaks, but more on that later).
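
The original script is not reproduced here, but the idea can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the function name `fuzz`, its parameters and the choice to overwrite with random bytes (rather than printable characters) are all illustrative.

```python
import random


def fuzz(data: bytes, fraction: float = 0.10, rng=None) -> bytes:
    """Overwrite one random contiguous section of `data` with random bytes.

    `fraction` is the share of the input to overwrite (10% by default).
    Passing a seeded random.Random as `rng` makes a run reproducible.
    """
    if not data:
        return data
    rng = rng or random.Random()
    # Overwrite at least one byte, even for very short inputs.
    n = max(1, int(len(data) * fraction))
    start = rng.randrange(len(data) - n + 1)
    noise = bytes(rng.randrange(256) for _ in range(n))
    return data[:start] + noise + data[start + n:]


sample = b"BEGIN:VCARD\r\nVERSION:4.0\r\nFN:Alex Example\r\nEND:VCARD\r\n"
mutated = fuzz(sample, rng=random.Random(1))
print(len(sample) == len(mutated))  # the length is preserved
```

Keeping the length unchanged and mutating only one window means most of each vCard stays well formed, so the damaged section reaches deeper into the parser instead of being rejected at the first line.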

On Day 5, I was shown the capabilities of American Fuzzy Lop (rather unhelpfully also the name of a breed of rabbit, as I found out upon an attempt to Google it), an advanced fuzz testing tool. It ran over the weekend and, like my much simpler fuzzer script, failed to generate any useful results, which greatly saddened me. Trawling through bug reports was likewise of little use.

A more exciting outcome came from abnfgen, a tool which uses the ABNF (Augmented Backus-Naur form, a metalanguage used in RFC specifications) in the vCard specification to generate lots of vCards, which should be valid. I generated 1000 vCards using abnfgen, and found a problem almost instantly - the parser hung upon being passed one of the vCards. Now, all that was left to do was to find where in the several thousand lines of code the issue lay. The GNU Debugger (GDB) helped pinpoint the approximate location of the issue, but after that it was really a matter of sticking print statements (or g_printerr statements, to be precise) absolutely everywhere. Eventually, I found that EDS got stuck in an infinite loop if it was passed a vCard with 2 consecutive commas in the list of parameters.
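
A hang like this only shows up if each test run has a time limit. The following Python sketch shows one way to run a test program over a batch of generated vCards and flag the ones that hang or crash. It is an assumption about how such a harness could look, not the one used during the placement: `classify_run`, `scan` and the `./vcard-test` program name are hypothetical.

```python
import subprocess


def classify_run(cmd, timeout_s=5.0):
    """Run `cmd` and return 'hang', 'crash' or 'ok'."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The child is killed once the time limit expires.
        return "hang"
    # On POSIX, a negative return code means the process died from a signal.
    return "crash" if result.returncode < 0 else "ok"


def scan(program, paths, timeout_s=5.0):
    """Return {path: verdict} for every input that did not finish cleanly."""
    verdicts = {}
    for path in paths:
        verdict = classify_run([program, path], timeout_s)
        if verdict != "ok":
            verdicts[path] = verdict
    return verdicts


# Hypothetical usage, with ./vcard-test standing in for the C test program:
# print(scan("./vcard-test", ["generated/0001.vcf", "generated/0002.vcf"]))
```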

Another bug, caused by incorrect handling of multibyte characters, was also pointed out to me. Some parts of the parser confused bytes and characters, taking the next 2 bytes rather than the next 2 Unicode characters, which could also cause some serious issues. The fix, however, was fairly simple.
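
The difference between bytes and characters is easy to demonstrate in Python. This sketch is only an illustration of the general problem in UTF-8 text, not a reproduction of the EDS code; the helper names `take_bytes` and `take_chars` are made up.

```python
def take_bytes(raw: bytes, n: int) -> bytes:
    """Take the first n bytes, which may cut a character in half."""
    return raw[:n]


def take_chars(raw: bytes, n: int) -> bytes:
    """Take the first n Unicode characters of UTF-8 encoded data."""
    return raw.decode("utf-8")[:n].encode("utf-8")


raw = "日本語".encode("utf-8")  # each of these characters is 3 bytes long

print(take_chars(raw, 2).decode("utf-8"))  # → 日本
print(len(take_chars(raw, 2)))             # → 6

try:
    take_bytes(raw, 2).decode("utf-8")
except UnicodeDecodeError:
    # Two bytes are only part of the first character.
    print("first 2 bytes are not valid UTF-8 on their own")
```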

Towards the end of my placement, I had run out of leads to follow, so I resorted to diagnosing memory leaks. As C is not a language with garbage collection, this meant stepping through the code and checking that every memory allocation was always paired with a statement to free the memory. Although I was aided by the indispensable Valgrind and my even more indispensable co-workers, memory leaks have, unfortunately, permanently stained my image of software engineering. Nonetheless, at the end of the 2 weeks, my overall impression of the job was very positive, and I can confidently say that I enjoyed my placement.

The non-programming side of things also went fairly well, especially once I had got my head around the eccentricities of the coffee machine. Working in an office where the majority of communication happens in cyberspace meant it was tranquil and work efficiency was high, though the experience was certainly something I had not encountered before.

Finally, I would like to thank all the staff at Collabora for being so helpful, but I would like to express my gratitude in particular to Philip, Vivek, and Emmanuel, who aided me immeasurably during my placement. I have learned a lot thanks to them.
