Auto-tools/Projects/OrangeFactor/2010-09-22

Revision as of 13:33, 23 October 2014 by Edmorley (talk | contribs) (Edmorley moved page Auto-tools/Projects/WarOnOrange/2010-09-22 to Auto-tools/Projects/OrangeFactor/2010-09-22: Renaming to the actual name of the webapp)

War on Orange meeting, September 22, 2010

Parsing:

  • logparser lives here: http://hg.mozilla.org/automation/logparser
    • THIS DOESN'T WORK! It's a straight port from topfails (which is to say it runs, but it is inadequate and there are also bugs filed against it)
  • should be written to Jython standards so it can run on Hadoop

Storage:

  • file layout in the filesystem mirrors that of the FTP site
  • (raw) log -> parser -> Flume -> HDFS
  • block size: 128 MB
    • does this make looking through files slow?

What do we want?

  • we have a (proposed) schema
  • we have a (proposed) REST interface
  • (we should put this on a wiki page and move towards finalization)

Process:

  • we provide a Python script (e.g. logparser)
  • the script is invoked on every log file
  • its output == what we want
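
A minimal sketch of the "script invoked on every log file" idea, for illustration only. It is not the logparser code. It assumes that failures show up on lines of the form `TEST-UNEXPECTED-FAIL | <test name> | <message>`, and the output record fields (`log`, `line`, `test`, `message`) are hypothetical, since the schema is still only proposed. It is written for plain Python and has not been checked against Jython.

```python
import json
import sys

# Assumption: failures appear on lines of the form
#   TEST-UNEXPECTED-FAIL | <test name> | <message>
MARKER = "TEST-UNEXPECTED-FAIL"


def parse_log(lines):
    """Yield one dict per failure line found in an iterable of log lines."""
    for lineno, line in enumerate(lines, 1):
        if MARKER not in line:
            continue
        # Drop any prefix before the marker, then split on the pipe
        # separators; tolerate a missing test name or message field.
        fields = line[line.index(MARKER):].split("|", 2)
        parts = [p.strip() for p in fields]
        yield {
            "line": lineno,
            "test": parts[1] if len(parts) > 1 else "",
            "message": parts[2] if len(parts) > 2 else "",
        }


def main(argv):
    # Invoked once per log file: path in, one JSON record per line out.
    # The field names are hypothetical placeholders for the real schema.
    for path in argv[1:]:
        with open(path) as log:
            for record in parse_log(log):
                record["log"] = path
                print(json.dumps(record, sort_keys=True))


if __name__ == "__main__":
    main(sys.argv)
```

Emitting one self-contained record per line keeps the script usable as a simple filter: whatever sits downstream (Flume, in the pipeline above) only has to forward lines and does not need to understand the log format.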