Stewards/Coding/DataGathering
Sources of data
- Bugzilla (full MySQL data dump available; talk to Josh for details)
- Interesting data: comments, comment dates, comment authors, attachments (e.g. code contributions), flag for first code contribution
- mozilla-central or GitHub
- Mailing list archives of users who say they wish to write code for Mozilla
- Usually for correlation with other pieces of data
Interesting data to be mined
- Contributors who are no longer active
- In progress, complete with ranking algorithm to prioritize prolific contributors
- Mentoring effectiveness
- Bugs in Bugzilla can have mentor=foo annotations; determine the ratio of fixed to open bugs for mentor foo and dig into the results (number of different people commenting in bugs, number of code contributions, how active the mentor is, etc.)
- Breakdown of volunteer activity across groups of components (such as Core: DOM and friends)
- Requires access to lists of employee names
- Effectiveness of mentored bugs as a stepping stone
- Figure out how many new contributors (i.e. people whose first contribution was in 2012) contributed to at least one mentored bug and at least one non-mentored bug
- Identify common threads
- For contributors who reach interesting levels on the conversion chart, analyze their interactions. Are there other contributors who keep showing up (either as mentors or just as frequent commenters)?
- Similarly, for contributors who disappear, analyze their interactions. Are there names that appear frequently (for multiple disappearing contributors, instead of in multiple bugs for a single contributor)?
- Figure out at what stage people disappear
- How often do they submit a patch but it doesn't get checked in?
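
Two of the questions above (the fixed-to-open ratio per mentor, and new contributors who touched both mentored and non-mentored bugs) could be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout (`mentor`, `state`, `patch_authors`), the `first_year` mapping, and both function names are hypothetical and do not reflect the actual Bugzilla schema or dump format. Anything not marked fixed is counted as open, which is a simplification.

```python
from collections import defaultdict

# Hypothetical, simplified bug records; the real Bugzilla dump looks different.
bugs = [
    {"id": 1, "mentor": "foo", "state": "fixed", "patch_authors": ["alice"]},
    {"id": 2, "mentor": "foo", "state": "open", "patch_authors": []},
    {"id": 3, "mentor": "foo", "state": "fixed", "patch_authors": ["bob"]},
    {"id": 4, "mentor": None, "state": "fixed", "patch_authors": ["alice"]},
    {"id": 5, "mentor": None, "state": "open", "patch_authors": ["carol"]},
]

# Hypothetical mapping: contributor -> year of first code contribution.
first_year = {"alice": 2012, "bob": 2012, "carol": 2011}


def mentor_fixed_to_open(bugs):
    """Return {mentor: (fixed_count, open_count)} for mentored bugs only."""
    counts = defaultdict(lambda: [0, 0])
    for bug in bugs:
        mentor = bug.get("mentor")
        if mentor is None:
            continue  # unmentored bugs do not count toward any mentor
        index = 0 if bug["state"] == "fixed" else 1
        counts[mentor][index] += 1
    return {mentor: tuple(pair) for mentor, pair in counts.items()}


def stepping_stone_contributors(bugs, first_year, year):
    """Return newcomers of `year` with patches on both mentored and
    non-mentored bugs."""
    mentored, unmentored = set(), set()
    for bug in bugs:
        target = mentored if bug.get("mentor") else unmentored
        target.update(bug["patch_authors"])
    newcomers = {name for name, y in first_year.items() if y == year}
    return newcomers & mentored & unmentored


print(mentor_fixed_to_open(bugs))
print(stepping_stone_contributors(bugs, first_year, 2012))
```

Returning raw counts rather than a single ratio avoids dividing by zero for mentors with no open bugs, and keeps the sample size visible when comparing mentors.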