Stewards/Coding/DataGathering

From MozillaWiki

Sources of data

  • Bugzilla (full MySQL data dump available; talk to Josh for details)
    • Interesting data: comments, comment dates, comment authors, attachments (e.g. code contributions), flag for first code contribution
  • mozilla-central or GitHub
  • Mailing list archives of users who say they wish to write code for Mozilla
    • For correlation with other pieces of data, usually

Interesting data to be mined

  • Contributors who are no longer active
    • In progress, complete with ranking algorithm to prioritize prolific contributors
  • Mentoring effectiveness
    • Bugs in Bugzilla can have mentor=foo annotations; determine the ratio of fixed to open bugs for mentor foo and dig into the results (number of different people commenting in bugs, number of code contributions, how active the mentor is, etc.)
  • Breakdown of volunteer activity across groups of components (such as Core: DOM and friends)
    • Requires access to lists of employee names
  • Effectiveness of mentored bugs as a stepping stone
    • Figure out how many new contributors (i.e. people whose first contribution was in 2012) contributed to at least one mentored bug and at least one non-mentored bug
  • Identify common threads
    • For contributors who reach interesting levels on the conversion chart, analyze their interactions. Are there other contributors who keep showing up (either as mentor or just commenting frequently)?
    • Similarly, for contributors who disappear, analyze their interactions. Are there names that appear frequently (for multiple disappearing contributors, instead of in multiple bugs for a single contributor)?
  • Figure out at what stage people disappear
    • How often do they submit a patch but it doesn't get checked in?
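
As a rough illustration of two of the ideas above (the per-mentor ratio of fixed to open bugs, and mentored bugs as a stepping stone), here is a minimal Python sketch. It does not reflect the real Bugzilla schema: the Bug and Contribution records, their field names, and the "FIXED" status value are all assumptions made for the example, and the data would first have to be extracted from the Bugzilla dump into this shape.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bug:
    # Hypothetical record; not the actual Bugzilla table layout.
    bug_id: int
    status: str            # assumed: "FIXED" means fixed, anything else counts as open
    mentor: Optional[str]  # None when the bug has no mentor=foo annotation


@dataclass
class Contribution:
    # Hypothetical record for one code contribution (e.g. a patch attachment).
    author: str
    bug_id: int
    year: int


def mentor_fixed_open_counts(bugs):
    """Count fixed and open mentored bugs for each mentor."""
    counts = defaultdict(lambda: {"fixed": 0, "open": 0})
    for bug in bugs:
        if bug.mentor is None:
            continue  # unmentored bugs are not attributed to anyone
        key = "fixed" if bug.status == "FIXED" else "open"
        counts[bug.mentor][key] += 1
    return dict(counts)


def stepping_stone_contributors(bugs, contributions, year):
    """Return authors whose first contribution was in `year` and who
    contributed to at least one mentored and one non-mentored bug."""
    mentored_ids = {bug.bug_id for bug in bugs if bug.mentor is not None}

    # Earliest contribution year seen for each author.
    first_year = {}
    for c in contributions:
        first_year[c.author] = min(c.year, first_year.get(c.author, c.year))

    # For each new contributor, track [touched mentored, touched non-mentored].
    touched = defaultdict(lambda: [False, False])
    for c in contributions:
        if first_year[c.author] != year:
            continue
        if c.bug_id in mentored_ids:
            touched[c.author][0] = True
        else:
            touched[c.author][1] = True

    return {author for author, (m, n) in touched.items() if m and n}
```

Calling mentor_fixed_open_counts(bugs) gives a dictionary keyed by mentor, from which the ratio can be computed and the outliers examined by hand. stepping_stone_contributors(bugs, contributions, 2012) returns the set of new 2012 contributors who moved between mentored and non-mentored bugs; dividing its size by the total number of new contributors would give the proportion.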