|
|
| =Data Set Documentation= | | =Data Set Documentation= |
| ==Longitudinal==
| |
| [[Telemetry/LongitudinalExamples|Complete documentation]]
| |
|
| |
|
| {{longitudinal data intro}}
| | This document now lives here: |
| | https://github.com/mozilla/telemetry-batch-view/blob/master/docs/choosing_a_dataset.md |
|
| |
|
| ==Main Summary==
| | [https://wiki.mozilla.org/api.php?action=query&list=backlinks&bltitle=Telemetry/Available_Telemetry_Datasets_and_their_Applications Wiki.mo pages linking to this dead page] |
| [https://github.com/mozilla/telemetry-batch-view/blob/master/docs/MainSummary.md Complete Documentation] | |
| | |
| Like the longitudinal dataset, main summary summarizes [https://gecko.readthedocs.io/en/latest/toolkit/components/telemetry/telemetry/data/main-ping.html main pings]. Each row corresponds to a single ping. This table does no sampling and includes all desktop pings.
| |
| | |
| ===Caveats=== | |
| Querying main summary on SQL.t.m.o/re:dash can '''impact performance for other users''' and can '''take a while to complete''' (~30m for simple queries). Because main summary includes a row for every ping, the table holds a very large number of records, and scanning them can consume a lot of resources on the shared cluster.
| |
| | |
| Instead, we recommend using the Longitudinal dataset where possible when querying from re:dash/sql.t.m.o. The longitudinal dataset is a 1% sample of all data and organizes the data by client_id. In the odd case where a query against main summary is necessary, limit it to a short submission_date_s3 range and ideally filter on the sample_id field as well. Even better, try using Spark.
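
As a rough illustration of that advice, the sketch below builds a query string that is restricted to a short submission_date_s3 range and a single sample_id. Several details are assumptions rather than documented facts from this page: the table name <code>main_summary</code>, the <code>YYYYMMDD</code> string format for submission_date_s3, sample_id being a string-typed value between 0 and 99, and the helper name <code>build_main_summary_query</code>, which is hypothetical. Adjust them to match the actual schema.

```python
from datetime import date


def build_main_summary_query(start: date, end: date, sample_id: int = 42) -> str:
    """Return a SQL string limited to a date range and one sample_id.

    Assumptions (not confirmed by this page): the table is named
    main_summary, submission_date_s3 is a 'YYYYMMDD' string, and
    sample_id is a string-typed value in the range 0..99.
    """
    if end < start:
        raise ValueError("end must not be earlier than start")
    if not 0 <= sample_id <= 99:
        raise ValueError("sample_id is assumed to lie in 0..99")

    # Both filters shrink the number of rows the shared cluster has to scan.
    return (
        "SELECT client_id, submission_date_s3\n"
        "FROM main_summary\n"
        f"WHERE submission_date_s3 >= '{start:%Y%m%d}'\n"
        f"  AND submission_date_s3 <= '{end:%Y%m%d}'\n"
        f"  AND sample_id = '{sample_id}'"
    )


if __name__ == "__main__":
    # Three days of data for a single sample_id.
    print(build_main_summary_query(date(2017, 1, 1), date(2017, 1, 3), sample_id=7))
```

Keeping the date range to a few days and pinning one sample_id keeps the scan small. Widen either filter only when the analysis genuinely needs more data.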
| |
| | |
| ==Cross Sectional==
| |
| | |
| | |
| ==Client Count==
| |
| | |
| ==Crash Aggregates==
| |
| | |
| ==Mobile Metrics==
| |
| The android_events, android_clients, android_addons, and mobile_clients tables are documented here:
| |
| https://wiki.mozilla.org/Mobile/Metrics/Redash
| |