CommonVoice: Difference between revisions

From MozillaWiki
Revision as of 11:55, 3 September 2021

What is Common Voice

Mozilla Common Voice is an initiative to help teach machines how real people speak.

This project is an effort to bridge the digital speech divide. Voice recognition technologies bring a human dimension to our devices, but developers need an enormous amount of voice data to build them. Currently, most of that data is expensive and proprietary.

We want to make voice data freely and publicly available, and make sure the data represents the diversity of real people. Together we can make voice recognition better for everyone.

You can contribute today on the Common Voice website (https://commonvoice.mozilla.org/).

How does Common Voice work?

We’re crowdsourcing an open-source dataset of voices. To start and support a language on Common Voice, the following steps are taken:

1. Submitting a new language request and localising the Common Voice platform via Pontoon

2. Collecting and validating public domain sentences via the sentence collector, sentence extractor or CC0 text waiver agreement.

3. Recording the sentences and validating the recordings on the Common Voice platform

4. Repeating this process to grow the size of the

5. Generating a dataset, which is released by the Common Voice team

This dataset can then be used by developers to create voice-enabled technologies.

Common Voice Communities

Common Voice wouldn’t be possible without our language communities. As of September 2021, we have 80 languages (https://commonvoice.mozilla.org/en/languages) launched for voice data collection.

Language community members and organisers mobilise participation, provide valuable feedback, and inspire us as a team. Our Community Playbook (https://common-voice.github.io/community-playbook/) outlines how communities participate in Common Voice.

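We support our communities through two main channels: Discourse (https://discourse.mozilla.org/c/voice) for group and topical discussions, and Matrix (https://chat.mozilla.org/#/room/#common-voice:mozilla.org) for community chats. Our communities also have their own communication channels to help with self-organising (https://github.com/common-voice/common-voice/blob/main/docs/COMMUNITIES.md).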

Materials & Assets to Use and Remix

If you make something, please add it below!

Also check Discourse for more materials.


Common Voice - Mini Business Card.png
Common Voice - Tabletop Tent Sign 300dpi.png
Common Voice XBanner 60x160.png
Common Voice XBanner - Event 60x160.png
Common Voice - Community Channnels Flyer.png