Data/WorkingGroups/GleanDictionary

From MozillaWiki
Revision as of 15:33, 8 February 2021 by Wlach (talk | contribs) (Public working group)

Quick Reference

Development URL: dictionary.protosaur.dev
Source code: mozilla/glean-dictionary on GitHub
Matrix channel: #glean-dictionary on chat.mozilla.org
Meetings: Fortnightly on Wednesdays at 5pm UTC. Zoom room: 91218334714. Meeting notes are open to the public.

Charter

The focus of this working group is to produce a usable version of the Glean Dictionary, a data dictionary for applications written using the Glean SDK. This includes Firefox for Android and Firefox for iOS. As Firefox on Glean comes together, the Glean Dictionary will also index the metadata sent by Firefox Desktop.

There are currently four planned phases for the project:

  1. (done) Index Glean metric and ping data produced by Glean applications, providing links to their representation in BigQuery
  2. (in progress) Index derived datasets produced by bigquery-etl, creating generated documentation at bigquery-etl
  3. (not started) Represent and link the dataset documentation outlined above in the Glean Dictionary
  4. (planning) Add facilities to annotate Glean metrics and pings with additional data and commentary useful for data scientists and other data practitioners at Mozilla

After phase 4 of the project is completed, we will re-evaluate the future of this group.

Communication

There are three primary communication channels for the group:

  • A fortnightly Zoom meeting open to all, for synchronous discussion and larger-scale strategic planning
  • The #glean-dictionary channel on Matrix for quick questions
  • Discussions in GitHub issues for design discussions and all other questions

Generally, discussion in GitHub issues is preferred, since it can happen asynchronously and leaves a historical record that we can refer back to. For larger changes, consider writing a proposal (see below).

All project communications should follow the Mozilla Community Participation Guidelines.

Stakeholders

The Glean team, Data Science, the Data Taxonomy Effort, and other consumers of data.

Membership

We welcome your feedback and involvement! We work in the open and anyone from the Mozilla community is welcome to join this group. This project involves a variety of pieces emphasizing different technologies, including:

  • Building out the frontend (JavaScript, Svelte)
  • Working on the data infrastructure pieces to gather metadata (Python, BigQuery)
  • Improving data documentation and metadata definitions through Mozilla (Markdown, YAML, Python)

If you want to contribute but aren't sure where to start, join our #glean-dictionary Matrix channel and say hi! Someone can probably find an initial task for you to work on.

Coordination

This group is currently coordinated by Will Lachance (wlach on GitHub, Matrix, and Mozilla Slack), who is responsible for creating the meeting agenda and ensuring that discussions lead to productive places. Feel free to get in touch if you have feedback or questions!

Proposals

Proposal                 Date               Status
Working Group Proposal   20 November 2020   Accepted
Initial Proposal         10 August 2020     Accepted