Bismuth: Visualizing a year of open source development

It’s that time of the year when you look back at what was done over the past twelve months. Some important events are easy to remember, and a quick look at the “completed” section of the Bismuth Roadmap shows how much was achieved in just 12 months.

https://github.com/bismuthfoundation/Roadmap#completed

What may be less obvious is the number of people involved and their daily activity.

Gource, dataviz tool for git coders

Fortunately, other open source tools can give some insight into a fraction of what the team did: namely, the GitHub commits.

Gource (apt install gource on Debian-based Linux flavors) specifically targets that and produces nice-looking videos. See https://gource.io for more info.

Since a user asked for such a dataviz for Bismuth, we made one, and here are the tweaks we used.

Collect the right logs

Bismuth development is spread over many repositories. Core development lives in the repo of the founder, hclivess, and the rest usually takes place under the BismuthFoundation organisation.

Fortunately, Gource can process the logs from several repositories: you just have to concatenate them.

The first step is then to collect and update all the git repos you want to graph, and extract a log from each.

I used a bash script to extract every log, like:

#!/bin/bash
gource --output-custom-log Bismuth.txt Bismuth
gource --output-custom-log BismuthProjects.txt BismuthProjects
gource --output-custom-log BismuthToolsWeb.txt BismuthToolsWeb
gource --output-custom-log hypernode.txt hypernode
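
The "collect and update" part can be folded into the same script. The following is only a sketch, assuming the four repositories are already cloned side by side in the current directory under the names used above. The `git pull --ff-only` step and the skip-if-missing check are additions for illustration, not part of the script that was actually used.

```shell
#!/bin/bash
# Repository directory names taken from the script above; adjust to your layout.
repos="Bismuth BismuthProjects BismuthToolsWeb hypernode"

for repo in $repos; do
  # Skip anything that is not a local git clone (assumption: clones sit in the current directory)
  if [ ! -d "$repo/.git" ]; then
    echo "skipping $repo: no git clone found in the current directory"
    continue
  fi
  # Bring the clone up to date, then dump its history in Gource's custom log format
  git -C "$repo" pull --ff-only
  gource --output-custom-log "$repo.txt" "$repo"
done
```

Each repository that is found produces one `<name>.txt` log next to its clone.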

We can then combine the logs into a single file, sorted by the timestamp found in the first column of every log line.
The Gource docs give an example command:

cat *.txt | sort -n > combined.log
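
To see what the merge does, here is a minimal sketch on two tiny hand-written files in Gource's custom log format (timestamp|user|type|path). The file names, users and paths are invented for the example and are not real Bismuth history.

```shell
#!/bin/bash
# Two throwaway logs in Gource's custom format: timestamp|user|A/M/D|path
# (sample data invented for this sketch)
printf '%s\n' '1514764800|alice|A|/node.py' '1514937600|alice|M|/node.py' > demo_repo1.txt
printf '%s\n' '1514851200|bob|A|/wallet.py' > demo_repo2.txt

# A numeric sort on the leading timestamp interleaves the two repos chronologically
cat demo_repo1.txt demo_repo2.txt | sort -n > demo_combined.log
cat demo_combined.log
```

The entry from the second file ends up between the two entries of the first one, which is what Gource needs to replay all the repos on a single timeline.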

Process and filter the logs

Since some users may have several local usernames, it can be useful to rewrite them to a single username. This is easily done with an in-place search and replace. For instance,

sed -i -r "s#Jan Kučera#HCLivess#" combined.log

replaces every occurrence of “Jan Kučera” with “HCLivess”.
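
Listing the usernames present in the log is a quick way to spot duplicates worth merging. The sketch below works on an invented three-line sample rather than the real combined.log, and writes the result to a new file instead of editing in place.

```shell
#!/bin/bash
# Throwaway sample in Gource's custom log format (timestamp|user|type|path);
# the entries are invented for this sketch.
printf '%s\n' \
  '1514764800|Jan Kučera|A|/node.py' \
  '1514851200|HCLivess|M|/node.py' \
  '1514937600|HCLivess|M|/wallet.py' > demo_users.log

# List every username with its number of log lines, most active first
cut -d'|' -f2 demo_users.log | sort | uniq -c | sort -rn

# Same substitution as above, written to a new file instead of in place
sed -E 's#Jan Kučera#HCLivess#' demo_users.log > demo_users_merged.log

# Only one distinct username should remain
cut -d'|' -f2 demo_users_merged.log | sort -u
```

Note that the pattern is not tied to the username column, so a file path containing the same text would be rewritten too.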

Now, other log lines just add noise and are not needed, like build artifacts (.pyc or .pyd), images or css files, static dir… They are equally easy to remove, with a single grep command:

grep -vE "(static|\.png|\.min\.js|\.pyc|\.pyd|nuitka|\.css|\.pem|\.db|\.zip|jpg|jpeg|pycache|exe|json)" combined.log > combined2.log

We now have a “combined2.log” file with only the relevant info we want to graph.
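
One caveat: bare words such as exe or json match anywhere in a log line, so a source file like /executor.py would be dropped as well. A stricter variant, shown here only as a sketch on invented sample lines and not the filter used for the video, anchors extensions to the end of the line and directory names to a full path component.

```shell
#!/bin/bash
# Invented sample lines in Gource's custom log format (timestamp|user|type|path)
printf '%s\n' \
  '1514764800|alice|A|/node.py' \
  '1514764801|alice|A|/static/logo.png' \
  '1514764802|bob|A|/executor.py' \
  '1514764803|bob|A|/__pycache__/node.cpython-36.pyc' \
  '1514764804|bob|M|/config.json' > demo_filter.log

# Extensions are anchored to the end of the line and directories to a whole
# path component, so "/executor.py" is not caught by a bare "exe".
grep -vE '(/(static|__pycache__)/|\.(png|jpe?g|css|pyc|pyd|exe|json|zip|db|pem)$)' \
  demo_filter.log > demo_filtered.log
cat demo_filtered.log
```

Only /node.py and /executor.py survive in this sample. The extension and directory lists are abbreviated for the example and would need to be extended to match the full filter.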

View the dataviz

Gource can take a few parameters to tune the display. I ran it with:
gource combined2.log --logo 256x256.png --seconds-per-day 0.2 --auto-skip-seconds 0.2 --start-date "2017-11-01"

seconds-per-day and auto-skip-seconds tune the speed of the video. With these settings, the whole year lasts about 2 minutes: fast enough not to be boring, slow enough that we can still follow what happens.

start-date begins the dataviz at the given date rather than at the beginning of the logs.
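
A roughly equivalent result can be had by trimming the log itself before handing it to Gource, which also makes the file smaller. This is only a sketch on invented sample lines. It assumes the cut-off is midnight UTC, whereas Gource may interpret the date in the local timezone.

```shell
#!/bin/bash
# Invented sample log lines (timestamp|user|type|path)
printf '%s\n' \
  '1509408000|alice|A|/old.py' \
  '1509494400|alice|A|/node.py' \
  '1514764800|bob|M|/node.py' > demo_dates.log

# 2017-11-01 00:00:00 UTC as a Unix timestamp
# (with GNU date: date -u -d '2017-11-01' +%s)
start=1509494400

# Keep only the entries at or after the start date
awk -F'|' -v start="$start" '$1 >= start' demo_dates.log > demo_trimmed.log
cat demo_trimmed.log
```

The entry dated 2017-10-31 is dropped, and the two later ones are kept.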

Export to video

The Gource docs also give useful info to tune the video size and quality. A very high quality (and very large) video can be exported with

gource --logo 256x256.png --seconds-per-day 0.2 --auto-skip-seconds 0.2 \
  --start-date "2017-11-01" -1920x1080 combined2.log -o - |
  ffmpeg -y -r 60 -f image2pipe -vcodec ppm -i - -vcodec libx264 -preset ultrafast \
  -pix_fmt yuv420p -crf 1 -threads 0 -bf 0 bismuth.mp4

# Alternatively, encode a bismuth.ppm file previously saved with "-o bismuth.ppm":
ffmpeg -y -r 50 -f image2pipe -vcodec ppm -i bismuth.ppm -vcodec libx264 \
  -preset slow -pix_fmt yuv420p -crf 5 -threads 0 -bf 0 bismuth.mp4

I was able to get a much smaller file (122 MB) with the following:

gource --logo 128x128.png --seconds-per-day 0.2 --auto-skip-seconds 0.2 \
  --start-date "2017-11-01" -1080x720 combined2.log -o bismuth.ppm
ffmpeg -y -r 29 -f image2pipe -vcodec ppm -i bismuth.ppm -vcodec libx264 \
  -preset veryslow -pix_fmt yuv420p -crf 15 -threads 0 -bf 0 bismuth.mp4

More options are available, for instance we could use real avatars, but the current video is good enough to show how busy the dev team was in 2018!

The resulting video

The video only shows the files that were added, deleted, or modified during the watch period. I chose to “merge” all the repos into a single virtual one instead of showing them as separate trees. That means all the files at the root of their respective repos appear at the center of the graph, with the subdirectories around them.

The video clearly shows periods of low activity, as well as periods where a lot of files are being touched. There can be a bias there, because many files can be added with only small edits, while one commit to a single file can contain a huge amount of work that won’t be seen.

However, some periods of intense activity and code base expansion are visible, like in August, with significant work on the hypernodes. The same goes for the period since October, with work on the next-gen Tornado wallet and translation work from the whole community.

At the same time, consistent work on various parts of the codebase can be seen, by several devs, be they core devs or helpful contributors. At the timescale of the video, it is rare for a single dev to be working alone at any given time.

Finally, it’s really enjoyable to see so much collaborative work on the various Bismuth repos, by several devs, over a whole year. Can’t wait to see what next year has in store. I’m pretty sure the dev work will not slow down!