* Plumb it in to the public facing dataset infrastructure, including metadata that links the public data back to the above review bug.
* Once the dataset has been published, it will be announced on the new Data @ Mozilla blog. It will also be added to https://docs.telemetry.mozilla.org/datasets/.
<big>'''Definitions'''</big>

'''Metric''' - A metric is anything we want to measure.
Examples: the number of clients that used the developer tools console, the number of active clients.

'''Dimension''' - A dimension is a qualitative value such as OS, channel, or date. In practice, a dimension often defines a sub-population on which we can calculate a metric, allowing us to segment the metric for further analysis.
Examples: if we have an OS dimension, we can analyze the number of active clients by OS.

'''Aggregate''' - A combined value of many measurements (metric values), typically grouped by dimension or sets of dimensions. See also Aggregate Data.

'''Individual-level Data''' - Data containing a dimension which uniquely identifies a single profile, user, client, etc.

'''Tabular Data''' - Data that consists of rows (or records) and columns (or fields). Each row has the same number of columns, and each column represents a dimension or metric for that row. Think of a spreadsheet or CSV file as examples of this type of data.
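
To make the definitions above concrete, here is a minimal Python sketch that computes the "active clients" metric over small tabular, individual-level data, grouped by a single dimension. The rows, the column names (<code>client_id</code>, <code>os</code>, <code>date</code>), and the helper <code>active_clients_by</code> are hypothetical and chosen only for illustration; they do not describe any actual Mozilla dataset or tooling.

```python
from collections import defaultdict

# Hypothetical individual-level, tabular data: one row per client per day.
# "client_id" identifies a single client; "os" and "date" are dimensions.
rows = [
    {"client_id": "a", "os": "Windows", "date": "2018-01-01"},
    {"client_id": "b", "os": "Windows", "date": "2018-01-01"},
    {"client_id": "a", "os": "Windows", "date": "2018-01-02"},
    {"client_id": "c", "os": "Linux", "date": "2018-01-01"},
]


def active_clients_by(rows, dimension):
    """Aggregate the 'active clients' metric, grouped by one dimension.

    Returns a dict mapping each dimension value to the number of
    distinct clients seen with that value.
    """
    clients = defaultdict(set)
    for row in rows:
        clients[row[dimension]].add(row["client_id"])
    return {value: len(ids) for value, ids in clients.items()}


by_os = active_clients_by(rows, "os")
by_date = active_clients_by(rows, "date")
print(by_os)
print(by_date)
```

The input rows are individual-level because each one carries <code>client_id</code>. The returned dictionaries are aggregates: they contain only dimension values and counts, with no client identifier.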
<big>'''Example Data'''</big>

Here are some examples of data aggregated to the levels described above.
* Level 7: raw data, with fine-grained timestamps
* Level 6: individual-level data, aggregated to day-level time granularity
* Level 5: anonymized individual-level data, identifiers replaced with pseudonyms
* Level 4: probabilistic aggregates
* Level 3: dimension-level aggregates without a minimum group size
* Level 2: dimension-level aggregates with a minimum group size
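
The difference between Level 3 and Level 2 in the list above can be sketched in Python as follows. This is an illustration only: the function name <code>aggregate_counts</code>, the sample values, and the threshold of 2 are assumptions, since this page does not specify an actual minimum group size or how it is enforced.

```python
from collections import Counter


def aggregate_counts(values, min_group_size=None):
    """Count rows per dimension value.

    With min_group_size=None, every group is reported (Level 3 style).
    With a threshold, groups smaller than it are dropped (Level 2 style).
    """
    counts = Counter(values)
    if min_group_size is None:
        return dict(counts)
    return {value: n for value, n in counts.items() if n >= min_group_size}


# Hypothetical OS dimension values, one entry per client.
os_values = ["Windows"] * 5 + ["Linux"] * 2 + ["Other"]

level3 = aggregate_counts(os_values)
# The threshold of 2 is an assumed value for this example only.
level2 = aggregate_counts(os_values, min_group_size=2)
print(level3)
print(level2)
```

In this sketch the "Other" group, which contains a single client, appears in the Level 3 output but is suppressed from the Level 2 output, because a group of one would effectively expose an individual.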