Respect Your Elders!

So, in my previous blog post, I talked a little about how we can show if it is the newcomers or the “oldies” that are the most active contributors to KDE SVN. Let’s jog our memories by taking another look at the 2010 data I previously posted:

Daily commits in 2010

What are we looking at here? This shows, for each day in 2010, the number of commits to KDE SVN. Each day is colour-coded to show the commits made by contributors who have been “around” for fewer than 6 months since their first commit, less than one year, less than two years, and more than two years.
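For anyone who wants to try this kind of breakdown on their own commit log, here is a minimal sketch. It is not the script behind the plot: the function names, the `(author, day)` input format and the day thresholds (183, 365 and 730 days) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import date


def tenure_category(first_commit: date, commit_day: date) -> str:
    """Return the involvement category for a commit made on commit_day.

    The thresholds in days are assumed approximations of 6 months,
    1 year and 2 years.
    """
    days = (commit_day - first_commit).days
    if days < 183:
        return "< 6 months"
    if days < 365:
        return "< 1 year"
    if days < 730:
        return "< 2 years"
    return "> 2 years"


def daily_commits_by_category(commits):
    """Count commits per day and category.

    commits: iterable of (author, day) pairs, one per commit, in any order.
    Returns a mapping day -> Counter(category -> number of commits).
    """
    commits = list(commits)

    # Find each author's first commit date.
    first_commit = {}
    for author, day in commits:
        if author not in first_commit or day < first_commit[author]:
            first_commit[author] = day

    # Bucket every commit by the author's tenure on the day of the commit.
    counts = defaultdict(Counter)
    for author, day in commits:
        counts[day][tenure_category(first_commit[author], day)] += 1
    return counts
```

Note that the category is worked out per commit, so a contributor moves from one colour band to the next as the year goes on.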

It is plain to see that the contributors who have been involved for more than two years are contributing the most commits per day. Now, the question I left us with last time was: is this because there is a larger number of committers in this category, or is their commit rate just higher?

Let’s take a look, starting with commit rate:

Average daily commits, in 2010, per committer in each category

This plot shows, for each day in 2010, the average number of commits made by the people in each involvement category. It is mostly a mass of dots, right? But what it does tell us is that the people who have been around the longest do not have a massively higher commit rate than the others. In fact, the data behind the plot shows that, on an average day, the commit rates per committer are as follows:

  • < 6 months: 3
  • < 1 year: 3
  • < 2 years: 4
  • > 2 years: 5
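One plausible way to arrive at figures like these is sketched below. The definition is an assumption: on each day, the commits in a category are divided by the number of distinct committers in that category who were active that day, and those daily rates are then averaged over the year. The function name and the `(day, category, author)` record format are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_daily_rate(records):
    """Average commits per active committer per day, for each category.

    records: iterable of (day, category, author) tuples, one per commit.
    Only days on which a category had at least one commit contribute
    to that category's average (an assumption of this sketch).
    """
    commits = defaultdict(int)   # (day, category) -> number of commits
    authors = defaultdict(set)   # (day, category) -> distinct committers
    for day, category, author in records:
        commits[(day, category)] += 1
        authors[(day, category)].add(author)

    # Daily rate = commits / active committers, collected per category.
    daily_rates = defaultdict(list)
    for (day, category), n in commits.items():
        daily_rates[category].append(n / len(authors[(day, category)]))

    return {category: mean(rates) for category, rates in daily_rates.items()}
```

Because only active committers are counted in the denominator, this rate says nothing about how often each person commits, only how much they commit on the days they do.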

This would lead us to believe it must simply be that there are more active people in the “> 2 years” category. So let’s take a look at the number of daily committers per category per day in 2010:

Daily committers, in 2010, per committer category

So there you have it! Clearly, there are simply more contributors fitting into the “> 2 years” category. On an average day in 2010, the number of committers in each category was:

  • < 6 months: 10
  • < 1 year: 8
  • < 2 years: 14
  • > 2 years: 54
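As a rough sanity check, the two sets of averages can be multiplied together to estimate each category's share of a typical day's commits. This is only a back-of-the-envelope sketch: the product of two averages is not exactly the average daily commit count, and the variable names are made up for illustration.

```python
# Average commits per committer per day, from the first list above.
rate = {"< 6 months": 3, "< 1 year": 3, "< 2 years": 4, "> 2 years": 5}

# Average number of committers per day, from the second list above.
committers = {"< 6 months": 10, "< 1 year": 8, "< 2 years": 14, "> 2 years": 54}

# Estimated commits per day in each category: 30, 24, 56 and 270.
estimate = {category: rate[category] * committers[category] for category in rate}
total = sum(estimate.values())  # 380

# Share of the estimated daily commits and of the daily committers.
commit_share = {category: estimate[category] / total for category in estimate}
committer_share = {
    category: committers[category] / sum(committers.values())
    for category in committers
}

for category in rate:
    print(f"{category}: {commit_share[category]:.0%} of commits, "
          f"{committer_share[category]:.0%} of committers")
```

On these numbers the “> 2 years” group accounts for about 63% of the daily committers and about 71% of the estimated daily commits, so head count explains most of the gap and the slightly higher rate explains the rest.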

So, here’s a question for you all: Does it feel a little odd to you that the committers who have been around for fewer than 6 months have similar commit rates to those who have been around for more than 2 years?


11 comments to Respect Your Elders!

  • Thomas Pfeiffer

    Figure 2 shows pretty clearly that long-term committers are not contributing a lot more per person than newer ones.

    The third figure alone, however, does not necessarily show that there are more long-term committers. If the newer committers just committed less regularly (like, say, only every third day), only one third of them would have committed on the “average day”. It only clearly shows it in conjunction with the second one (but so does the first one ;) )
    Sorry for being pedantic here, I blame it on my education in Psychology teaching me to always look twice at statistics ;)

    Now to give my opinion regarding your question:
    It does not seem that odd to me, actually. Of course there are new committers who just do a few commits and then leave, but I can imagine that their commit rate starts pretty high because they want to fix or implement something they personally want (and since they are new, they probably won’t get it perfectly with their first commit) and after that they just drop out of the statistics because they don’t commit at all after that.
    And even those who stay are probably going to commit a lot at the beginning because they are most highly motivated. And those, together with those who gradually lose interest over the first few months, seem to result in the same average as the long-term committers.

    So what I’d be interested in is whether the variance in the commit rate is higher among new contributors than among long-term contributors. Figure 2 looks like this might be the case, but having it in numbers would be great ;)

    • Paul Adams

      Hi Thomas,

      Thanks for taking the time to comment.

      Yes, you are quite right, it is only in the context of all three plots that we can come to the conclusion that there are more long-term committers. That is exactly why all three appear in this post. Interesting question regarding variance in commit rates. This is more statisticsy (for want of a real word) than I would normally put in this blog… I will come back with another comment later to answer this.

  • Daniel

    Interesting. It seems to me that “Does it feel a little odd to you that the committers who have been around for fewer than 6 months have similar commit rates to those who have been around for more than 2 years?” refers to 3 commits per day (for 2 years). I guess this is pretty subjective; to me 3 and 5 commits per day could be a big difference, and if it had been 3 vs 6, then you could say that a >2 years contributor makes twice as many commits. Do you know if long term committers get into a habit of making smaller commits (which could be considered a good practice in working with a source control system)? However, this question starts getting dangerously close to LOC as a measure of progress, so it probably does not have a very useful statistical answer. Nevertheless, this kind of statistics is cool, thanks!

    • Paul Adams

      Hi Daniel,

      Thanks for your comment.

      “Do you know if long term committers get into a habit of making smaller commits (which could be considered a good practice in working with a source control system)?”

      I’m afraid I don’t, no. I agree, however, that we have to stop short of getting into LOC. In my work, if need be, I typically assume that an individual’s average patch size does not fluctuate much over time. I also assume that the average patch size varies significantly between developers (because of differing personal process, because of differing style, because of working on different types of artefact).

      The reason I asked the question about commit rate between newer and older developers is to do with ramp-up. We would normally expect that people new to a project are not fully ramped-up (not working to their full potential yet). However, if they are committing at a similar rate to the “oldies”, speed is clearly not the problem. Perhaps their patches are smaller. Perhaps the quality is lower. These questions are lower-level than I normally go down to.

  • Daniel

    Oops, it should have said “3 commits per day (for 2 years contributors)”, sorry.

  • hmmm

    Should we be worried about the clear downward trend in commits? Or is it an effect of the progressive migration to git?

    • Paul Adams

      Good question! The downward trend is caused by the move to git, imo.

      • hmmm

        Is it possible to have the stats also with the migration to git? Or is git unsuitable for the analyses you are doing with svn?

  • Sekar

    It would be interesting to see top 10 committers of all time!!

    • Paul Adams

      Wouldn’t it just!? Up until September 20th (when I last updated my SVN log), they are:
      scripty 170129 (not actually a human)
      mlaurent 36378
      dfaure 31629
      cgilles 23069
      coolo 20913
      mueller 17285
      ilic 10907
      aseigo 10769
      pino 10512
      winterz 10295
      aacid 9997

  • yoda

    Hey, great statistics! Can we ask for a plot of how many new contributors joined KDE in past years?