Find Your Voice

Gamers run wild in the vast meadows of GPUs and hardcore computers. Being an occasional heavy gamer myself, I tremendously relate to the desire to push the boundaries of maximum framerates and ultra graphics quality.

Where am I going with this? And what does people spending their free time have to do with my open source project?

Turns out, a lot.

The Siren’s Lure

Without delving into computing history, all I will say is that gamers pushed processor manufacturers to improve on their graphics cards, often exponentially. Afterwards, it turned out that graphics cards were really good for training deep learning models due to the way they handled their memory structures and parallelized their computation. Like most things, the producers of this hardware discovered an entire field of computing where they could apply this – machine learning.

Software models developed, white papers were written, frameworks taking advantage of half-float calculations were delivered, and developers made their own Jarvis clones. Soon organizations like Mozilla developed programs that could help collect datasets for people looking to do research or test out an ML model over the weekend (what else is there to do?).

Mozilla quickly gathered verified voice samples with over 22GB of audio files through their open source project. Combined with an implementation of Baidu’s DeepSpeech, Mozilla offered a fully-featured model that could be used to train over thousands of voices in a field where innovation occurs daily.

This is where my story begins.

Testing the Waters

If I pay heed to all the open source books with carefully crafted tenets or read through blogs about how to be successful in open source that are floating around, I will walk away with the impression that if I have not used a service, it is more difficult (both technically and emotionally) contribute to that service. I understand why that is appropriate since there is nothing for me to give back. Thankfully, I didn’t have to worry about not being motivated for my next project. I used the service in both English and French. Although I have – what some call – an interesting accent, I did not contribute my voice too much to the project. However, I graciously verified loads of samples where I would be amused by the background sounds of people’s recordings. Slightly creepy? Perhaps. But, this lent me an idea of the user interface and some of the mild irritations I felt during my navigation of the website.

Strike a Chord

If it’s not immediately evident from my previous post, I detested my first selection of open source project to work on. If Common Voice is a siren’s melodious tune, Nextcloud Server was Hades’s Screech Owl. Satisfied with my selection, I roamed about in the project’s GitHub repository. After making my own fork and cloning it into my local machine, I started looking for bugs I could fix. Now, the entire process was very lightweight and I quickly realized that I would need to build the fully-fledged development environment.

So, I started to do just that.

Hook, Line and Docker

The Project was nicely designed with a proper Mozilla Public License (MPL) and a sufficiently designed CONTRIBUTING.md file. I followed the steps as strictly as I could and only needed to install a two extra pieces of software. Since I had already used my computer for web development, most of the software needed for programming was already present.

After installing MariaDB and Docker on my Macbook, I cloned the repo and issued the docker-compose up command per the instructions.

Never had I seen my battery deplete so quickly. The fans revved vigorously, the heat rose steadily, and the script’s outputs scrolled by as fast as they came. After it had set up an entirely new docker container, it attempted to run a node engine to start the different protocols for communicating with the database and display the front-end. My anxiety rose with my machine’s temperature and I realized that I, a mere mortal, did not have enough RAM to produce something like this.

Even though it built all the scripts, it could not produce a proper output; instead, it spat out an error: ERROR: An HTTP request took too long to complete. Retry with --verbose to obtain debug information. “I should report it,” I thought to myself. Then I talked myself out of it seeing how it could just be my Macbook Pro with it’s paltry CPU and a measly 8GB of RAM that held the application back.

Bait and Switch

I could not give up. How could I? Even if I failed, I needed to know why I failed. Open source, if anything, is about the culture of constant failure and putting up with irritating build methods and incomplete documentation. Saddened that my Macbook could not take it, I decided to switch to my Ubuntu Machine. A beast.

Problem is, I need to set up more tools than before. I had to struggle with installing Docker. MariaDB was as easy as apt install mariadb, but Docker was being a pain in a certain part of the body. After installing from the official snap store failed, I had to search for documentation on how to (pseudo)manually install docker. Thankfully, DigitalOcean’s detailed article came to the rescue. Before I mechanically went through the list of commands, I made sure to time myself until the entire process was ever.

16 minutes 41 seconds. That is how long it took from typing the first command to having it run flawlessly.

Swim

Now that the project was built, I could peacefully look at the issues on the repository that needed help. Most of the issues were on the front-end that had to do with the User Interface and/or User Experience. However, I did notice some technical issues that merits a deeper dive. Making things more interesting, the repository uses Typescript which is a statically typed superset of JavaScript and exotic to my eyes. If anything, this adds to the depth of learning this project could offer.

What issues did I choose? How did the community react to me? Find out in the coming weeks.

Written before or on March 21, 2019