Connected Devices/Projects/Metrics/Database and Visualization

From MozillaWiki
< Connected Devices‎ | Projects‎ | Metrics
Revision as of 17:00, 15 June 2016 by Doliver (talk | contribs) (data desc)
Jump to navigation Jump to search

Summary

We are using a combination of Google Analytics, Google BigQuery database, and Re:Dash to enable teams to view & analyze project data. This page will describe what data we have and how to access it.

BigQuery

BigQuery is Google's cloud-based big data solution. We are using it as a store for all of the data sent to Google Analytics so that teams can view the full detail of all events sent by their application, rather than the aggregated views that are presented in the Google Analytics dashboards.

Data is imported in batches from Analytics several times per day. Note that there may be a delay of up to 8 hours before event data is exported, though it usually seems to arrive within 2 hours.

BigQuery supports a subset of SQL, covering most common clauses and operators. For a full reference, see https://cloud.google.com/bigquery/query-reference.

Table Schema

ga_sessions Tables

The Analytics data is exported into a series of tables, which are named as follows:

  • ga_sessions_20160613 - This table contains all event data recorded on 2016-06-13. A new table is created for each day's data.
  • ga_sessions_intraday_20160614 - This table contains all event data that has been exported for the current day, and batches of data will be appended to it several times during the day.

If you are familiar with Google Analytics data and/or working with multi-dimensional data, you are welcome to query these tables directly. You can find the schema reference here: https://support.google.com/analytics/answer/3437719

Simplified Views

With the Connected Devices metrics solution, we are only using a relatively small subset of the data available in Google Analytics. To simplify the picture, we have created a couple of views to allow easier access to the information. These views are flattened and can be queried with simple SQL.

There are currently two primary views available:

  • AllEvents_7Day
  • AllEvents_30Day

The schema for these tables is identical:

Column Data Type Description
HitTimestamp String The UTC datetime in which the event was received by Google Analytics. This is a timestamp with the format "YYYY-MM-DD 24:MM:SS", e.g. "2016-06-11 21:00:54".
HitTimestampUsec Integer The UTC time (in microseconds) in which the event was received by Google Analytics, e.g. 1465678854000000.
EventTimestamp String The UTC datetime in which the event was recorded by the Metrics library, if you are using that method to record your events. Because the library may batch up your events before sending them to GA, this can be a more accurate time of the event occurrence. If you are posting your events directly via URL, you can disregard this column and use HitTimestamp instead.
FullVisitorId String A unique ID generated for the user, based on (but not equal to) the Client ID parameter in the API. Repeated events from the same client ID will result in the same FullVisitorId value.
VisitId Integer Example
VisitNumber Integer Example
HitNumber Integer Example
AppName String Example
EventCategory String Example
EventAction String Example
EventLabel String Example
EventValue Integer Example
EventValueFP String Example
OS String Example
OSVersion String Example
Device String Example
Arch String Example
AppPlatform String Example
AppBuildID String Example

(For reference, the query used to create these views can be found at: https://github.com/dylano/metrics/tree/master/sql)

If these views are insufficient for your project or you'd like to discuss other ways to view or use the data, feel free to contact the CD Metrics team. (#cd-metrics, :doliver)


Data/Vis

  • G analytics
    • file bug to request access
  • re:dash
** soon to be available at sql.telemetry.m.o
  • periscope
** contact doliver for access
  • other tools may be able to access
** contact doliver