Template:Longitudinal data intro: Difference between revisions

Add big neon sign suggesting users focus on the l10l set
(fix intro)
(Add big neon sign suggesting users focus on the l10l set)
 
Line 1: Line 1:
The longitudinal dataset is a summary of main pings and differs from main_summary in two important ways:
The longitudinal dataset is a summary of main pings. If you're not sure which dataset to use for your query, this is probably what you want. It differs from the main_summary table in two important ways:
* The longitudinal dataset groups all data for a client-id in the same row. This makes it easy to report profile-level metrics. Without this deduping, metrics would be weighted by the number of submissions instead of by clients.
* The longitudinal dataset groups all data for a client-id in the same row. This makes it easy to report profile-level metrics. Without this deduping, metrics would be weighted by the number of submissions instead of by clients.
* The dataset uses a 1% sample of all recent profiles, which reduces query computation time and saves resources. The sample of clients will be stable over time.
* The dataset uses a 1% sample of all recent profiles, which reduces query computation time and saves resources. The sample of clients will be stable over time.
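
To illustrate the weighting point in the first bullet, here is a minimal Python sketch. It does not query the real longitudinal or main_summary tables; the `pings` list, the client ids, and the metric values are all made up for illustration. It shows how averaging over submissions lets a client with many pings dominate the result, while grouping by client first gives each profile equal weight.

```python
from collections import defaultdict

# Hypothetical ping records as (client_id, metric_value) pairs.
# Client "a" submitted four pings, client "b" only one.
pings = [
    ("a", 10), ("a", 10), ("a", 10), ("a", 10),
    ("b", 2),
]

def mean_per_submission(records):
    """Average over every ping: clients with more pings count more."""
    values = [value for _, value in records]
    return sum(values) / len(values)

def mean_per_client(records):
    """Group pings by client first, then average the per-client means."""
    by_client = defaultdict(list)
    for client_id, value in records:
        by_client[client_id].append(value)
    client_means = [sum(vs) / len(vs) for vs in by_client.values()]
    return sum(client_means) / len(client_means)

submission_weighted = mean_per_submission(pings)  # (10*4 + 2) / 5
client_weighted = mean_per_client(pings)          # (10 + 2) / 2

print(submission_weighted, client_weighted)
```

In this toy data the per-submission mean is pulled toward client "a", which is the distortion that one-row-per-client grouping avoids.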