CommonVoice: Difference between revisions

From MozillaWiki

Revision as of 12:11, 3 September 2021

What is Common Voice?

Mozilla Common Voice is an initiative to help teach machines how real people speak.

This project is an effort to bridge the digital speech divide. Voice recognition technologies bring a human dimension to our devices, but developers need an enormous amount of voice data to build them. Currently, most of that data is expensive and proprietary.

We want to make voice data freely and publicly available, and make sure the data represents the diversity of real people. Together we can make voice recognition better for everyone.

You can contribute today on Common Voice (https://commonvoice.mozilla.org/).

How does Common Voice work?

We’re crowdsourcing an open-source dataset of voices. To start and support a language on Common Voice, the following steps are taken:

1. Requesting a new language and localising the Common Voice platform via Pontoon

2. Collecting and validating public domain sentences via the sentence collector, sentence extractor or CC0 text waiver agreement

3. Recording the sentences and validating the recordings on the Common Voice platform

4. Repeating this process to grow the size of the dataset

5. Generating a dataset which is released by the Common Voice team

This dataset can then be used by developers to create voice-enabled technologies.

Common Voice Communities

Common Voice wouldn’t be possible without our language communities. As of September 2021, 80 languages have launched for voice data collection.

Community Playbook

Language community members and organisers mobilise participation, provide valuable feedback and inspire us as a team. Our Community Playbook outlines how communities participate in Common Voice.

Communications Channels

We support our communities through two main channels: Discourse for group and topical discussions, and Matrix for community chats. Our communities also have their own communication channels to help with self-organising.

We share weekly updates from the Common Voice team on Discourse, coordinated by Hillary, the Common Voice Community Manager.

Materials & Assets to Use and Remix

If you make something, please add it below!

Also check Discourse for more materials.


Common Voice - Mini Business Card.png
Common Voice - Tabletop Tent Sign 300dpi.png
Common Voice XBanner 60x160.png
Common Voice XBanner - Event 60x160.png
Common Voice - Community Channnels Flyer.png