Template:Longitudinal data intro: Difference between revisions
(Adding initial stub for longitudinal set intro)
(Fixing blockquote, text was running off the page)
Revision as of 23:36, 29 July 2016
The longitudinal dataset is a summary of main pings. In general, you should prefer the longitudinal dataset over main_summary unless there are extenuating circumstances.
In particular, the longitudinal dataset:
- makes it easy to report profile-level metrics by grouping all data for the same client-id into a single row
- is sampled to 1% of all recent profiles, which reduces query computation time and saves resources
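The grouping described in the first point can be sketched in plain Python. This is only an illustration of the idea, not the pipeline that builds the dataset: the ping records, the `startup_ms` field, and the `group_by_profile` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical flat records, one per main ping (field names are illustrative).
pings = [
    {"client_id": "a", "startup_ms": 1200},
    {"client_id": "b", "startup_ms": 900},
    {"client_id": "a", "startup_ms": 1100},
]


def group_by_profile(pings):
    """Collect every ping for a client into one row of list-valued columns."""
    rows = defaultdict(lambda: defaultdict(list))
    for ping in pings:
        row = rows[ping["client_id"]]
        for field, value in ping.items():
            if field != "client_id":
                # Append in arrival order: one value per submission.
                row[field].append(value)
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {client_id: dict(columns) for client_id, columns in rows.items()}


longitudinal = group_by_profile(pings)
```

With the data grouped this way, a per-profile question only needs to look at a single row instead of scanning every ping.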
As discussed in the Longitudinal Data Set Example Notebook (https://gist.github.com/vitillo/627eab7e2b3f814725d2):
The longitudinal dataset is logically organized as a table where rows represent profiles and columns the various metrics (e.g. startup time). Each field of the table contains a list of values, one per Telemetry submission received for that profile. [...]
The current version of the longitudinal dataset has been built with all main pings received from 1% of profiles across all channels with [...] up to 180 days of data.
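As a rough illustration of the layout the notebook describes, the sketch below models a few rows as Python dictionaries and computes one profile-level metric from a list-valued column. The row contents, the `startup_ms` column name, and the `mean_per_profile` helper are assumptions made for this example; the real dataset's schema may differ.

```python
from statistics import mean

# Hypothetical rows: one row per profile, and each metric column holds
# one value per Telemetry submission received for that profile.
rows = [
    {"client_id": "a", "startup_ms": [1200, 1100, 1000]},
    {"client_id": "b", "startup_ms": [900]},
]


def mean_per_profile(rows, column):
    """Return the mean of a list-valued column for each profile."""
    return {
        row["client_id"]: mean(row[column])
        for row in rows
        if row[column]  # skip profiles with no submissions for this column
    }


result = mean_per_profile(rows, "startup_ms")
```

Because each field is a list, the aggregation happens within a row; no join or group-by across pings is needed.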