Intellego: Difference between revisions

From MozillaWiki
(→‎Team: Add note about list of skills.)
(+Obsolete flag)
 
(16 intermediate revisions by 4 users not shown)
Line 1: Line 1:
{{RELEASE_MANAGEMENT_OBSOLETE}}
Intellego is a machine translation project for the benefit of Mozilla and the Open Web.
 
__NOTOC__
== Project details ==
Intellego is a machine translation (MT) initiative that seeks to unify existing open MT projects by providing a single platform for engine developers and a unified web service that hosts a number of different language pairs/engines/implementations in the back end. The Intellego platform will allow users to select from a number of open MT engines based on the most prominent MT methodologies in order to find the best target MT output for their on-the-fly translation. This will be accomplished by partnering with existing open MT projects and hooking into their infrastructures, freeing Mozilla of the requirement to develop and host MT engines and promoting the use of MT engines within the fragmented open ecosystem.


; Our Mission
MT users are limited to using engines that follow a single MT methodology for all language pairs and content types. Studies have shown that a one-size-fits-all approach in MT does not provide the user with optimal translation output. Users need a single access point to different MT engines following different MT methodologies that will produce the best quality output by selecting the right engine for the right language pair. Intellego seeks to further establish an open MT ecosystem, as we feel it is the best way to quickly provide high-quality MT services to users on the web at the lowest cost and in a way that engages the open source community.
: To provide users with automated translation, from any language, to any language, in real time, on any software or device that is useful to them.
=== Intellego Platform ===
; Our Vision
Intellego is a centralized machine translation platform that provides the following services:
: A world where language is no longer a barrier to communication on the Internet and people can understand each other effortlessly regardless of their linguistic origins.
#A unified web service for open source machine translation engines, allowing them front-end accessibility on the Web. This service will function as follows:
; Our Values
#*Users insert text into a text field and see it translated on the fly in an accompanying text field.
: Language is free; language is beautiful; language deserves to be protected; a single language isn't the future; the Internet belongs to everyone and everyone has the right to participate and benefit from it.
#*Users insert a URL into a text field; Intellego extracts the text nodes from the DOM, runs them through Intellego, and returns a rendered page in a new tab containing the target-language text.
; Our Motto
#*Users can select their preferred MT engine.
: We Will Be Understood.
#*Intellego will intelligently determine the most appropriate MT engine given a language pair and content type.
 
#*An advanced mode allows users to select and arrange MT pipelines that use multiple MT engines.
=== Technical details ===
#*A terminology processing utility (from the GSoC project) will ensure accurate and consistent terminology translation regardless of MT method.
 
#An easy and simple API for the following services:
; Intellego will be a machine translation platform consisting of various open source engines based on the most prominent approaches to MT (i.e., SMT, RBMT, EBMT, and hybrid).
#*A widget that developers can add to their sites for automatic translation by Intellego MT engines.
: Research has revealed that certain approaches produce better output for certain language pairs and content types. For example, Russian's grammar is incredibly complex. So much so, that SMT output for Russian is usually very flawed. The RBMT approach has demonstrably produced better output for language pairs that include Russian. In addition, RBMT is best suited for long sentences and structured content, like wikis, whereas SMT is best suited for short sentences and user generated content.
#*A way for Mozilla l10n tools developers to link directly into hosted open MT services through Intellego.
; Intellego will aim to provide a single API for engine developers and a unified web service that hosts a number of different language pairs/engines/implementations in the back end.
#Users can post-edit MT output in context and submit their edits as feedback (leveraging Pontoon modules). The community can vote on suggested post-edits, and the highest-voted feedback is entered into the engine corpora.
: This aims to increase accessibility to smaller, more efficient MT engines on the web. Kevin Scannell gave this example: "If you look at Apertium for example, there are some language pairs that are better performing than Google Translate, and many pairs that Google doesn't support at all.  But they don't have the infrastructure to keep a web service up and running (they've tried and it's been up and down)." This will help break up the proprietary nature of MT and allow for a greater presence of open MT on the web.
: In addition, it will help to satisfy our aim to make the Intellego platform available through an open API and web services, as is stated on the wiki.
; The GSoC terminology-based project will serve as a pre-processing utility in the Intellego MT process to provide accuracy in translation.
: Research suggests that when a user evaluates MT output, they tend to be more accepting of MT error when it is grammar based, rather than terminology based.
 
== Project meetings ==
 
The Intellego team [[/Meetings/Status|meets every week]] to discuss the progress of the project.


We also occasionally have [[/Meetings/Sprints|sprint meetings]], where we work on a particular aspect of the project for a long stretch of time.
Explore the wiki for more details about the [[Intellego/Mission|Intellego project's purpose and focus]] as well as the [[Intellego/Goals Milestones|goals and milestones]] for the platform's development and the [https://intellego.etherpad.mozilla.org/tech-spec-process-model platform technical spec (in progress)].
 
For more information about meetings, see our [[/Meetings|meetings page]].


== Resources ==
 
* [[Intellego/Goals_Milestones|Project goals and milestones]]
* [[Intellego/Research|MT Research]]
* [[Intellego/Research|Our MT Research]]
* [[Intellego/Mission|Project philosophy]]
* [http://intellego.etherpad.mozilla.org/ Intellego team etherpad]
* Interested in working on Intellego for Google Summer of Code?  Tell us about yourself on our [https://wiki.mozilla.org/Intellego/GSoC Google Summer of Code page].
* Related effort: [[mw:Content translation]], [[mw:Content translation/cxserver|cxserver]]


== Team ==
Line 42: Line 35:
; {{Mozillian|gueroJeff|Jeff Beatty}} (gueroJeff)
: '''Team lead.'''
: Localization, solutions architect, organization, programming.
: Localization, organization, programming.
; {{Mozillian|Kensie|Majken Connor}} (Kensie)
: Community outreach, evangelism.
Line 49: Line 42:
; {{Mozillian|mekki|Mekki MacAulay}} (mekki)
: Strategic management, partnerships, grants, business collaboration, evangelism.
== Project meetings ==
The Intellego team [[/Meetings/Status|meets every week]] to discuss the progress of the project.
We also occasionally have [[/Meetings/Sprints|sprint meetings]], where we work on a particular aspect of the project for a long stretch of time.
For more information about meetings, see our [[/Meetings|meetings page]].


== Discussion ==
Line 54: Line 55:
* IRC: {{IRC|intellego}}
* Newsgroup: [http://www.mozilla.org/about/forums/#tools-l10n mozilla.tools.l10n]
* [https://discourse.mozilla-community.org/c/intellego Forum]

Latest revision as of 13:48, 2 January 2019

Warning: The content of this page is obsolete and kept for archiving purposes of past processes.

Intellego is a machine translation project for the benefit of Mozilla and the Open Web.

Project details

Intellego is a machine translation (MT) initiative that seeks to unify existing open MT projects by providing a single platform for engine developers and a unified web service that hosts a number of different language pairs/engines/implementations in the back end. The Intellego platform will allow users to select from a number of open MT engines based on the most prominent MT methodologies in order to find the best target MT output for their on-the-fly translation. This will be accomplished by partnering with existing open MT projects and hooking into their infrastructures, freeing Mozilla of the requirement to develop and host MT engines and promoting the use of MT engines within the fragmented open ecosystem.

MT users are limited to using engines that follow a single MT methodology for all language pairs and content types. Studies have shown that a one-size-fits-all approach in MT does not provide the user with optimal translation output. Users need a single access point to different MT engines following different MT methodologies that will produce the best quality output by selecting the right engine for the right language pair. Intellego seeks to further establish an open MT ecosystem, as we feel it is the best way to quickly provide high-quality MT services to users on the web at the lowest cost and in a way that engages the open source community.

Intellego Platform

Intellego is a centralized machine translation platform that provides the following services:

  1. A unified web service for open source machine translation engines, giving them a front end on the Web. This service will function as follows:
    • Users insert text into a text field and see it translated on the fly in an accompanying text field.
    • Users insert a URL into a text field; Intellego extracts the text nodes from the DOM, runs them through Intellego, and returns a rendered page in a new tab containing the target-language text.
    • Users can select their preferred MT engine.
    • Intellego will intelligently determine the most appropriate MT engine given a language pair and content type.
    • An advanced mode allows users to select and arrange MT pipelines that use multiple MT engines.
    • A terminology processing utility (from the GSoC project) will ensure accurate and consistent terminology translation regardless of MT method.
  2. An easy and simple API for the following services:
    • A widget that developers can add to their sites for automatic translation by Intellego MT engines.
    • A way for Mozilla l10n tools developers to link directly into hosted open MT services through Intellego.
  3. Users can post-edit MT output in context and submit their edits as feedback (leveraging Pontoon modules). The community can vote on suggested post-edits, and the highest-voted feedback is entered into the engine corpora.
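
Two of the behaviours listed above, pulling text nodes out of a page and choosing an engine for a language pair and content type, could be sketched roughly as follows. This is only an illustration under stated assumptions: the routing table, the engine labels, the content-type names, and the function names are all hypothetical placeholders and do not describe an actual Intellego implementation or API.

```python
from html.parser import HTMLParser

# Hypothetical routing table: (source, target, content type) -> engine label.
# The entries are placeholders for illustration, not a real configuration.
ROUTES = {
    ("ru", "en", "structured"): "rbmt",
    ("en", "es", "user_generated"): "smt",
}
DEFAULT_ENGINE = "smt"  # assumed fallback when no route matches


def select_engine(source, target, content_type, preferred=None):
    """Return the user's preferred engine if given, else the routed one."""
    if preferred is not None:
        return preferred
    return ROUTES.get((source, target, content_type), DEFAULT_ENGINE)


class TextNodeCollector(HTMLParser):
    """Collect non-empty text nodes, skipping script and style content."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.nodes = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.nodes.append(text)


def extract_text_nodes(html):
    """Return the visible text nodes of an HTML string, in document order."""
    collector = TextNodeCollector()
    collector.feed(html)
    collector.close()
    return collector.nodes
```

In this sketch an explicit user preference always overrides the routing table, which mirrors the idea that users can select their preferred engine while automatic selection remains the default. The extracted text nodes would then be passed to whichever engine was selected; that step is left out because it depends on each partner engine's own interface.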

Explore the wiki for more details about the Intellego project's purpose and focus as well as the goals and milestones for the platform's development and the platform technical spec (in progress).

Resources

Team

These are the members of our Intellego team, with a brief overview of their relevant skills:

Jeff Beatty (gueroJeff)
Team lead.
Localization, organization, programming.
Majken Connor (Kensie)
Community outreach, evangelism.
Gordon P. Hemsley (GPHemsley)
Linguistics, programming, BCP 47 (language tags).
Mekki MacAulay (mekki)
Strategic management, partnerships, grants, business collaboration, evangelism.

Project meetings

The Intellego team meets every week to discuss the progress of the project.

We also occasionally have sprint meetings, where we work on a particular aspect of the project for a long stretch of time.

For more information about meetings, see our meetings page.

Discussion