User:Rhelmer:Build/test farm proposals

From MozillaWiki

Operating principles

  • find more ways to work with other groups (QA, devs, IT)
    • not just requirements passing, but divide up work tasks among teams
  • work smarter not harder
    • fix root cause, not symptoms
  • nightly infrastructure should be a proving ground for release infrastructure
    • progression for new tools and processes is dev->staging->nightlies->releases
    • not applicable in every single case but something to strive for
  • whenever possible, work with existing open source products rather than reimplementing

Goals

  • redundancy
    • should not have to log into servers except under extraordinary circumstances
      • e.g. machine crash, unusual behavior, security breach
    • problems solved by re-imaging
      • e.g. locate problem, fix in ref platform (hotfix running servers if severe), re-image servers
  • reproducibility
    • ref image tracked, able to reproduce automatically
      • e.g. individual versions of tools
      • trap things like auto-updates
  • transparency
    • allow trusted users to make changes on their own (usually with review)
    • when trusted users are unable to make their own changes, provide a view into the process so the requester feels in control, e.g. regular updates or on-demand status
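
One way to approach the reproducibility goal is to keep a version-controlled manifest of tool versions for the ref image and compare each machine against it, which would also surface silent auto-updates. The sketch below is only an illustration: the manifest contents, the version numbers, and the `find_drift` helper are hypothetical placeholders, not existing tooling.

```python
# Hypothetical reference-platform manifest: tool name -> pinned version.
# The tools and versions here are placeholders; in practice this file
# would live in version control next to the ref image definition.
REF_MANIFEST = {
    "gcc": "4.1.2",
    "make": "3.81",
    "perl": "5.8.8",
}


def find_drift(manifest, installed):
    """Return {tool: (expected, actual)} for every tool that differs.

    `installed` maps tool name -> version found on the machine. A tool
    that is missing from the machine is reported with actual == None.
    """
    drift = {}
    for tool, expected in sorted(manifest.items()):
        actual = installed.get(tool)
        if actual != expected:
            drift[tool] = (expected, actual)
    return drift


if __name__ == "__main__":
    # Example machine: "make" was auto-updated and "perl" is missing.
    installed = {"gcc": "4.1.2", "make": "3.82"}
    for tool, (expected, actual) in find_drift(REF_MANIFEST, installed).items():
        print(f"{tool}: expected {expected}, found {actual}")
```

A machine that reports any drift would be a candidate for re-imaging rather than a manual fix, in line with the redundancy goal above.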

Specifics

Short term

  • determine ownership
    • build owns ref image/build & release harness, IT owns servers, QA owns test harness/tests, devs own code/tests
      • Not set in stone! Individuals can and should help out wherever possible.
  • move Tinderbox to stand-alone (extracted) tests
    • extract "test-only-tinderbox" server scraper
    • run in "while" loop
  • remove test code from tinderbox (build-seamonkey-util.pl)
  • define ref platform for testing tinderboxes
  • build ref platform image for testing tinderboxes
    • build/qa do dev->staging, all groups help testing, IT does staging->prod
  • work with IT to automate rollout (e.g. Norton Ghost or other disk-imaging software)
  • determine method of re-imaging that will work for current Tinderbox setup
    • preed suggests having the tinderbox dir on a separate partition
  • work on shell handling, error code, and logging (esp. STDERR) problems in Tinderbox
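
The shell handling, error code, and STDERR items above describe a general pattern: run each step without going through a shell, capture STDERR separately from STDOUT, and always check the exit code. The following Python sketch shows that pattern only. It is not Tinderbox code, and the `run_step` helper and the failing example command are hypothetical.

```python
import subprocess
import sys


def run_step(argv, timeout=None):
    """Run one build/test step without a shell and report what happened.

    Returns (exit_code, stdout, stderr). Passing the command as a list
    avoids shell quoting problems, and capturing STDERR on its own pipe
    keeps error output from being lost or mixed into the normal log.
    """
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Hypothetical step: a child process that writes to STDERR and fails.
    code, out, err = run_step(
        [sys.executable, "-c", "import sys; sys.stderr.write('boom\\n'); sys.exit(3)"]
    )
    print("exit code:", code)
    print("stderr:", err.strip())
```

Returning the exit code instead of raising leaves the caller free to decide whether a failed step should stop the run or just be logged.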

Medium term

  • move to newer test harness (for example, Google has contributed one)
    • alice is working on it; it needs work to run on mac/linux, mostly due to platform-specific data collection (e.g. memory)
    • enable trusted users (e.g. devs, QA) to add/remove tests as needed
      • auth/change control needed, possibly just a version-controlled CVS file à la tinder-configs
  • drive tests using Buildbot
    • make testing machines fit into generic classes, instead of attaching specific tests to specific machines
    • port tinderbox "test-only-tinderbox" server scraper to status object
    • use "try" feature to allow uploading of builds by trusted users
    • Mozilla integration work by bhearsum at seneca to integrate with Tinderbox
  • move functionality out of tinderbox and into the build system where applicable, or into separate scripts, e.g.
    • l10n repack
    • deploy to stage
    • product-specific code (xforms, sunbird/lightning, firefox, etc.)
    • nightly update generation
  • use patcher2 for both nightly updates and production updates
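
The idea of generic machine classes (rather than tests pinned to specific machines) can be illustrated with a small scheduling sketch. This is plain Python and does not use the Buildbot API. The class names, machine names, test names, and the `assign_jobs` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical machine classes and the tests each class is able to run.
TESTS_BY_CLASS = {
    "linux": ["unit", "perf"],
    "mac": ["unit"],
    "win32": ["unit", "perf"],
}

# Hypothetical pool: machine name -> class. Machines in one class are
# interchangeable, so any of them can pick up a job for that class.
MACHINES = {
    "test-linux-01": "linux",
    "test-linux-02": "linux",
    "test-mac-01": "mac",
}


def assign_jobs(tests_by_class, machines):
    """Spread each class's tests round-robin over that class's machines."""
    pool = defaultdict(list)
    for name, machine_class in sorted(machines.items()):
        pool[machine_class].append(name)

    assignments = defaultdict(list)
    for machine_class, tests in sorted(tests_by_class.items()):
        members = pool.get(machine_class)
        if not members:
            continue  # no machine of this class is currently available
        for i, test in enumerate(tests):
            assignments[members[i % len(members)]].append(test)
    return dict(assignments)


if __name__ == "__main__":
    for machine, tests in sorted(assign_jobs(TESTS_BY_CLASS, MACHINES).items()):
        print(machine, "->", ", ".join(tests))
```

Because tests are attached to a class, adding or re-imaging a machine only changes the pool, not the test configuration.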

Long term

  • drive builds using Buildbot
    • the basic pull/build functionality is provided by client.mk
      • might be worth calling build-seamonkey.pl if we haven't moved enough functionality out of Tinderbox yet (e.g. release builds are done this way)
    • make testing machines fit into generic classes, instead of attaching specific tests to specific machines
    • use "try" feature to allow uploading of patches by trusted users
    • use server upload/download feature to transfer builds to server machine (avoiding current stage/tinderbox-server lag race condition)
    • Mozilla integration work by bhearsum at seneca to integrate with Bonsai/Tinderbox
    • integrate the