Cooperation in Open Source Development. An Empirical Study of the Debian Project.

Text

Presentation at DebConf 9

Datasets

For the statistical analysis, I prepared several datasets from a PostgreSQL database.

  • Dataset of all bugs, bug submitters, and bug fixers: bugtime.RData
  • Activity of contributors by month and by "activity periods": activity months.csv, activity periods.csv
  • An anonymized SQL dump of the whole database is available on request. At 250 MB, it is too large to host on this website.
  • A version of the bug dataset that includes all statistical analysis results is also available on request (550 MB).
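
To get a first look at one of the CSV files, a short Python script is enough. The following is only a minimal sketch: it assumes the file is comma-separated, UTF-8 encoded, and starts with a header row, none of which is stated here. The function name summarize_csv is made up for this illustration, and no column names are assumed.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file.

    Assumes a comma-separated, UTF-8 encoded file whose first row
    is a header; adjust if the actual file differs.
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no rows
        row_count = sum(1 for _ in reader)  # remaining rows are data rows
    return header, row_count


if __name__ == "__main__":
    # File name as listed above; change the path to where the download was saved.
    csv_path = Path("activity months.csv")
    if csv_path.exists():
        columns, rows = summarize_csv(csv_path)
        print(f"{rows} data rows, columns: {columns}")
    else:
        print(f"{csv_path} not found")
```

The bugtime.RData file is an R workspace file and is most easily opened in R itself.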

Scripts

Data Gathering

All scripts used for data gathering and preparation are written in Python. They are Free Software, licensed under the GNU General Public License (GPL). The source tarball includes a README file with further documentation.

debian data scripts.tar.gz

Statistical Analysis

Coming soon.