So Lydia asked me about getting slightly more fine-grained information about daily commits. She pointed me to this video which, at the 15-minute mark, has a visualisation of people contributing to Wikipedia. This visualisation reveals how long people have been contributing to the community.
So, as a distraction from my work on detecting core team members, I coded a tool that does the following:
- Scans the entire log for a project’s history and finds the date upon which an SVN account commits for the first time;
- For each day, counts the commits by people who have been “around” for less than 6 months, 6 to 12 months, 1 to 2 years, and more than 2 years;
- Plots this data for arbitrary time periods using gnuplot.
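The first two steps can be sketched roughly as follows. This is a minimal illustration and not the actual tool: it assumes the SVN log has already been parsed into `(account, date)` pairs, approximates the age boundaries in days, and uses made-up function names (`first_commit_dates`, `bucket_for`, `daily_counts_by_age`).

```python
from collections import Counter, defaultdict
from datetime import date

# Assumed age boundaries in days (6 months ~ 182, 1 year ~ 365, 2 years ~ 730).
BUCKETS = [
    (182, "< 6 months"),
    (365, "6-12 months"),
    (730, "1-2 years"),
]
OLDEST = "> 2 years"


def first_commit_dates(commits):
    """Map each account to the date of its earliest commit."""
    first = {}
    for account, day in commits:
        if account not in first or day < first[account]:
            first[account] = day
    return first


def bucket_for(age_days):
    """Return the age category for an account that is age_days old."""
    for limit, label in BUCKETS:
        if age_days < limit:
            return label
    return OLDEST


def daily_counts_by_age(commits):
    """Count commits per day, split by how long each committer has been around."""
    first = first_commit_dates(commits)
    counts = defaultdict(Counter)
    for account, day in commits:
        age_days = (day - first[account]).days
        counts[day][bucket_for(age_days)] += 1
    return counts


# Hypothetical accounts and dates, purely for illustration.
example = [
    ("alice", date(2007, 1, 1)),
    ("alice", date(2010, 3, 1)),
    ("bob", date(2010, 1, 15)),
    ("bob", date(2010, 3, 1)),
]
counts = daily_counts_by_age(example)
```

The per-day counters in `counts` can then be written out one row per day, one column per category, which is a convenient shape for gnuplot.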
So let’s start by taking a look at KDE SVN for 2010:
There are a few things to note here:
- The overall downward trend in daily commits (probably caused by people switching to git);
- The vast majority of work is being conducted by people who have been in KDE for over 2 years.
There is probably a good reason why this group is responsible for the majority of the commits, the most obvious being that it is the largest group. It is also possible that people in this group have, on average, a higher commit rate (though I would argue this is less likely). I will do some more work on this point later…
When I shared this plot with someone earlier, they came up with an interesting suggestion: perhaps the people who have been around longest are the most resistant to switching to git. This is an interesting thought and easy enough to test. Let’s look at the same plot for 2009:
Just by looking you can see that, overall, the KDE SVN commit rate dropped between 2009 and 2010, almost certainly because of the migration to git. But are the golden oldies really the ones holding on to SVN the tightest? Actually, no.
Between 2009 and 2010, the change in average daily commit rate per “age” category was roughly as follows:
- < 6 months: -28%
- 6 to 12 months: -17%
- 1 to 2 years: -7%
- > 2 years: -13%
There is nothing particularly significant in this (statistically or otherwise). There goes that theory.
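For reference, a per-category figure like the ones above is just the relative change in mean daily commits between the two years. A minimal sketch, with a hypothetical helper name and placeholder numbers that are not taken from the KDE data:

```python
def percent_change(old_mean, new_mean):
    """Relative change from old_mean to new_mean, as a percentage."""
    if old_mean == 0:
        raise ValueError("old_mean must be non-zero")
    return (new_mean - old_mean) * 100.0 / old_mean


# Placeholder values: a category averaging 50 commits/day in one year
# and 40 commits/day in the next.
change = percent_change(50.0, 40.0)
print(f"{change:+.0f}%")
```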
One last thing that I think is worth mentioning: take a look at the < 6 months commits for 2010. Notice a growth pattern around the summer? You need to look really carefully to see the same in 2009. Still, I think there is some indication here of the impact of Google Summer of Code and Season of KDE.