Template:Longitudinal data intro: Difference between revisions

Add big neon sign suggesting users focus on the l10l set
(fix intro)
(Add big neon sign suggesting users focus on the l10l set)
 
Line 1: Line 1:
The longitudinal dataset is a summary of main pings and differs from main_summary in two important ways:
The longitudinal dataset is a summary of main pings. If you're not sure which dataset to use for your query, this is probably what you want. It differs from the main_summary table in two important ways:
* The longitudinal dataset groups all data for a client-id in the same row. This makes it easy to report profile-level metrics. Without this deduping, metrics would be weighted by the number of submissions instead of by the number of clients.
* The dataset uses a 1% sample of all recent profiles, which reduces query computation time and saves resources. The sample of clients is stable over time.
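
The two points above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `pings` records, the `uses_feature` flag, and the helper names are hypothetical, and the hash-based `in_stable_sample` only shows one way a sample can stay stable over time, not necessarily how the longitudinal dataset selects its clients.

```python
import zlib
from collections import defaultdict

# Hypothetical submissions: (client_id, uses_feature). One heavy client
# sends four pings, two light clients send one ping each.
pings = [
    ("client-a", True),
    ("client-a", True),
    ("client-a", True),
    ("client-a", True),
    ("client-b", False),
    ("client-c", False),
]


def share_by_submission(pings):
    """Fraction of submissions with the flag set (weighted by ping count)."""
    return sum(1 for _, flag in pings if flag) / len(pings)


def share_by_client(pings):
    """Fraction of clients with the flag set, after grouping pings per client."""
    by_client = defaultdict(list)
    for client_id, flag in pings:
        by_client[client_id].append(flag)
    return sum(1 for flags in by_client.values() if any(flags)) / len(by_client)


def in_stable_sample(client_id, percent=1):
    """Assumed sampling rule: hash the client id into 100 buckets.

    The same client id always lands in the same bucket, so membership
    in the sample does not change from one run to the next.
    """
    return zlib.crc32(client_id.encode("utf-8")) % 100 < percent


print(share_by_submission(pings))  # 4 of 6 submissions
print(share_by_client(pings))      # 1 of 3 clients
```

Here the per-submission figure overstates how common the feature is, because the one heavy client contributes four of the six rows. Grouping by client first, as the longitudinal dataset does, counts each profile once.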