CloudServices/Sagrada/Metlog

Revision as of 17:22, 4 October 2011 by Rmiller

Overview

The Metrics project is part of Project Sagrada. It provides a service that lets applications capture arbitrary data and inject it into back end storage suitable for out-of-band analytics and processing.

Project

Engineers

  • Rob Miller
  • Victor Ng


User Requirements

The first version of the Metrics system will focus on providing an easy mechanism for the Sync and BrowserID projects (and any other internal Mozilla services) to efficiently send profiling data, and any other metrics information that may be desired, into one or more back end storage locations. Once the data has reached its final destination, those with appropriate access should be able to run analytics queries and generate reports on the accumulated data.

Requirements:

  • Services apps should be provided with an easy-to-use API that allows them to send arbitrary text data into the metrics and reporting infrastructure.
  • Processing and I/O load generated by the API calls made by the services apps must be extremely small to allow for minimal impact on app performance even when there is a very high volume of messages being passed.
  • API should provide a mechanism for arbitrary metadata to be attached to every message payload.
  • Overall system should provide a sensible set of message categories so that commonly generated types of messages can be labeled as such, and so that the processing and reporting functionality can easily distinguish between the various types of message payloads.
  • Message taxonomy must be easily extendable to support message types that are not defined up front.
  • Message processing system must be able to distinguish between different message types, so the various types can be routed to the appropriate back end(s) for effective analysis and reporting.
  • Service app owners must have access to an interface (or interfaces) that will provide reporting and querying capabilities appropriate to the various types of messages that have been sent into the system.


Proposed API

The atomic unit of the Services Metrics system is the "message". The structure of a message is inspired by the well-known syslog message standard, with some slight extensions to allow for arbitrary metadata. Each message will consist of the following fields:

  • timestamp: Time at which the message is generated.
  • logger: String token identifying the message generator, such as the name of the service application in question.
  • severity: Numerical code from 0-7 indicating the severity of the message, as defined by RFC 5424.
  • message: Message text payload.
  • metadata: Arbitrary set of key/value pairs that indicates the type of message that is being generated and includes any additional data that may be useful for back end reporting or analysis.
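
To make the field list concrete, the following is a minimal sketch of one message built as a Python dictionary. The helper name build_message, the use of epoch seconds for the timestamp, the JSON encoding, and the example metadata keys are all assumptions made for illustration; this page does not specify a wire format.

```python
import json
import time


def build_message(message="", logger="", severity=6, metadata=None, timestamp=None):
    """Return a dict holding the five message fields listed above.

    Hypothetical helper: the name and the dict layout are illustrative only.
    """
    if not 0 <= severity <= 7:
        raise ValueError("severity must be in the range 0-7 (RFC 5424)")
    return {
        # Epoch seconds is an assumed representation of the timestamp.
        "timestamp": time.time() if timestamp is None else timestamp,
        "logger": logger,
        "severity": severity,
        "message": message,
        # Copy so later changes by the caller do not alter the message.
        "metadata": dict(metadata or {}),
    }


# Hypothetical example: a timing measurement reported by a Sync server.
example = build_message(
    message="42",
    logger="sync",
    metadata={"type": "timer", "name": "storage.get", "unit": "ms"},
    timestamp=1317748920.0,
)

# JSON is an assumed encoding, chosen here only because it is easy to inspect.
encoded = json.dumps(example, sort_keys=True)
```

Keeping the message type inside metadata, rather than in a dedicated field, matches the description of the metadata field above: new message types need no change to the message structure itself.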

We will provide a "MetLog" library that will ease generation of these messages and will handle packaging them up and delivering them (via UDP) into the message processing infrastructure. Implementations of this library will likely be available in both Python and JavaScript. The Python library will be available first, so this document will, for now, describe only the Python API. The JavaScript API will be similar, apart from syntactic sugar that is available in Python but not in JavaScript (e.g. decorators, context managers), and will be documented in detail in the future. The proposed Python API is as follows:

set_metlog_dest(host, port)

   Specifies the address and port of the metlog listener, the destination of
   the UDP packets that will be sent out as a result of subsequent metlog
   calls.  The Services Python framework will provide a mechanism to specify
   this via configuration files so services authors won't have to make this
   call themselves.

set_default_logger(logger)

   Specifies a logger value to use as the default for all subsequent
   metlog calls in which a logger value is not explicitly provided.

set_message_flavor(flavor_name, metadata)

   The metadata for a given message can be used to label and categorize that
   message.  This function expects a string value flavor_name and a
   dictionary metadata.  The flavor name value can be passed in as a
   flavor to subsequent metlog calls as shorthand for including the
   specified metadata in the outgoing message.
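
   As an illustration of how flavors might expand into metadata, here is a
   minimal sketch. The module-level registry, the helper resolve_metadata, and
   the merge order (flavor values overwrite explicit metadata on key
   collisions, later flavors overwrite earlier ones) are assumptions; the
   proposal does not define them.

```python
_flavors = {}  # assumed registry: flavor name -> metadata dict


def set_message_flavor(flavor_name, metadata):
    """Register metadata under a flavor name for reuse in later calls."""
    _flavors[flavor_name] = dict(metadata)


def resolve_metadata(metadata=None, flavors=None):
    """Hypothetical helper: merge each named flavor into the metadata."""
    merged = dict(metadata or {})
    for name in flavors or []:
        # An unknown flavor name raises KeyError in this sketch.
        merged.update(_flavors[name])
    return merged


set_message_flavor("timer", {"type": "timer", "unit": "ms"})
set_message_flavor("sync", {"app": "sync"})
result = resolve_metadata({"name": "storage.get"}, flavors=["timer", "sync"])
```

   Here result holds the explicit "name" key together with the keys
   contributed by both flavors.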

metlog(timestamp=None, logger=None, severity=6, message="", metadata=None, flavors=None)

   Sends a single log message to the previously specified metlog listener.
   Most of the arguments correspond to the message fields described above.
   None of them are strictly required, but most of them will be populated by
   reasonable defaults if they aren't provided:
   * timestamp: Defaults to current system time.
   * logger: Defaults to what has been specified using the
     set_default_logger call, or to an empty string if
     set_default_logger hasn't been called.
   * severity: Defaults to 6 ("Informational")
   * message: Defaults to an empty string
   * metadata: Defaults to an empty dictionary
   * flavors: Any specified flavors will cause this message's metadata
     value to be updated to contain the flavor's metadata; defaults to an
     empty list
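
   The four calls above could fit together roughly as follows. This is a
   sketch under stated assumptions, not the MetLog implementation: the JSON
   encoding, the module-level state, the RuntimeError raised when no
   destination is set, and the returned payload are all illustrative choices
   that this page does not specify.

```python
import json
import socket
import time

_dest = None          # (host, port) of the metlog listener
_default_logger = ""  # used when a call passes no logger
_flavors = {}         # flavor name -> metadata dict
_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)


def set_metlog_dest(host, port):
    global _dest
    _dest = (host, port)


def set_default_logger(logger):
    global _default_logger
    _default_logger = logger


def set_message_flavor(flavor_name, metadata):
    _flavors[flavor_name] = dict(metadata)


def metlog(timestamp=None, logger=None, severity=6, message="",
           metadata=None, flavors=None):
    if _dest is None:
        # Assumed behaviour; the proposal does not cover this case.
        raise RuntimeError("call set_metlog_dest() first")
    merged = dict(metadata or {})
    for name in flavors or []:
        merged.update(_flavors[name])  # flavor metadata is merged in
    payload = {
        "timestamp": time.time() if timestamp is None else timestamp,
        "logger": _default_logger if logger is None else logger,
        "severity": severity,
        "message": message,
        "metadata": merged,
    }
    # One datagram per message. UDP is fire-and-forget, so the caller
    # never waits for an acknowledgement from the listener.
    _sock.sendto(json.dumps(payload).encode("utf-8"), _dest)
    return payload  # returned only to make the sketch easy to inspect
```

   After set_metlog_dest and set_default_logger have been called once at
   startup, a call such as metlog(message="42", flavors=["timer"]) would emit
   a single datagram carrying the default logger, severity 6, and the
   metadata registered for the "timer" flavor.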