Template:Longitudinal data intro: Difference between revisions

From MozillaWiki
Jump to navigation Jump to search
(Adding initial stub for longitudinal set intro)
 
(Fixing blockquote, text was running off the page)
Line 8: Line 8:
As discussed in the [https://gist.github.com/vitillo/627eab7e2b3f814725d2 Longitudinal Data Set Example Notebook]:
As discussed in the [https://gist.github.com/vitillo/627eab7e2b3f814725d2 Longitudinal Data Set Example Notebook]:


The longitudinal dataset is logically organized as a table where rows represent profiles and columns the various metrics (e.g. startup time). Each field of the table contains a list of values, one per Telemetry submission received for that profile. [...]
<blockquote>
The longitudinal dataset is logically organized as a table where rows represent profiles and columns the various metrics (e.g. startup time). Each field of the table contains a list of values, one per Telemetry submission received for that profile. [...]
   
   
The current version of the longitudinal dataset has been build with all main pings received from 1% of profiles across all channels with [...] up to 180 days of data.
The current version of the longitudinal dataset has been build with all main pings received from 1% of profiles across all channels with [...] up to 180 days of data.
</blockquote>

Revision as of 23:36, 29 July 2016

The longitudinal dataset is a summary of main pings. In general, you should prefer using the longitudinal set to main_summary unless there are extenuating circumstances.

In particular, the longitudinal dataset:

  • makes it easy to report profile level metrics by grouping data for the same client-id in the same row
  • samples to 1% of all recent profiles, which will reduce query computation time and save resources

As discussed in the Longitudinal Data Set Example Notebook:

The longitudinal dataset is logically organized as a table where rows represent profiles and columns the various metrics (e.g. startup time). Each field of the table contains a list of values, one per Telemetry submission received for that profile. [...]

The current version of the longitudinal dataset has been build with all main pings received from 1% of profiles across all channels with [...] up to 180 days of data.