Elmo/Log Storage

Some terms in this document might be misleading, so here is a quick glossary.

build: A buildbot build process; most often this is a compare-locales source check. It is not related to binary bits uploaded to ftp. A build is a sequence of steps.
step: The buildbot steps make up the build process and contain actions like updating a Mercurial repository, loading data off the network, or comparing en-US to localized sources. Each step can have multiple log files.
log: A buildbot log corresponds to a file on disk, possibly compressed with bz2.
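
Because a log file may or may not be compressed on disk, any reader has to handle both cases. A minimal sketch in Python, assuming the compressed variant sits next to the plain name with a `.bz2` suffix (the suffix convention and the `read_log` helper are assumptions for illustration, not actual Elmo or buildbot code):

```python
import bz2
from pathlib import Path


def read_log(path):
    """Return the raw bytes of a log file, plain or bz2-compressed.

    Assumption: a compressed log is stored as "<name>.bz2" in the same
    directory as the uncompressed "<name>" would be.
    """
    path = Path(path)
    if path.suffix == ".bz2":
        # Caller pointed directly at the compressed file.
        return bz2.decompress(path.read_bytes())
    if path.exists():
        # Uncompressed log, return as-is.
        return path.read_bytes()
    # Fall back to the compressed sibling; raises FileNotFoundError if absent.
    compressed = path.with_name(path.name + ".bz2")
    return bz2.decompress(compressed.read_bytes())
```

The helper returns bytes rather than text, since nothing here states the encoding of the logs.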

Problem statement

Elmo stores the detailed output of the translation checks in the log files generated during the build that produces the data.

Buildbot stores those logs as individual files, following a specific naming scheme, in one directory per builder. The files themselves can be any size from a few dozen bytes to several MB.
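
To get a feel for the scale of such a layout, one could count files and bytes per builder directory. The sketch below assumes a hypothetical root directory that contains one subdirectory per builder; `log_usage` is an illustrative name, and no particular file naming scheme is assumed.

```python
import os


def log_usage(root):
    """Map each builder directory under root to (file_count, total_bytes).

    Assumption: every immediate subdirectory of root is a builder
    directory, and every regular file inside it is a log.
    """
    usage = {}
    for builder in os.scandir(root):
        if not builder.is_dir():
            continue  # ignore stray files next to the builder directories
        count = 0
        size = 0
        for entry in os.scandir(builder.path):
            if entry.is_file():
                count += 1
                size += entry.stat().st_size
        usage[builder.name] = (count, size)
    return usage
```

Using `os.scandir` instead of `os.listdir` plus `os.stat` keeps the number of filesystem calls down, which matters when the directories live on network storage.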

To serve that data to the web, the storage needs to be accessible to all web heads, for production, stage, and dev.

Specifying a strict retention policy just to cap the amount of data is not going to help, due to the scale of the problem. Keeping just the most recent file would still make NFS unhappy. Also, the "current" data might be old. The signed-off data for Occitan Firefox 3.6 (we're still shipping that) is from December 2009. In fact, that data is lost. The most recent run is from January 2012, but only because Axel reran those builds with a new version of the source checks.