User:GPHemsley/BCP 47: Difference between revisions
Revision as of 03:25, 10 June 2011
Summary
This is the plan for addressing bug 356038, the implementation of BCP 47.
Plan
- Create master list of languages
- Tie in spellcheckers
- Upgrade Languages UI to use central list
- Convert L10n mechanisms to use central list
- Improve Languages UI
Tasks
- Update the JS code to handle all the new requirements in BCP 47.
- Update list of language names ('Primary Language Subtags').
  - Do we exclude extinct/historical languages? If so, based on what criteria?
- Update list of region names ('Region Subtags').
  - Who or what decides when to differ from how the IANA registry lists regions?
- Add list of script names ('Script Subtags').
- Intentionally ignore 'Extended Language Subtags', as they are generally for backwards compatibility with 'Primary Language Subtags' that represent macrolanguages.
- Intentionally ignore 'Redundant Registrations', as they are generally for backwards compatibility and can be composed of other valid subtags.
- Decide how to handle 'Variant Subtags', 'Extension Subtags', and 'Private Use Subtags', as well as 'Grandfathered Registrations', as it is unclear how they will come into play with regard to language selection or localization.
- Decide how to clean up and/or reorganize language groups.
  - Can they be superseded by 'Script Subtags'?
- Decide whether specifying the "accepted" languages is necessary.
  - What are the reasons for a language not being "accepted"?
- Decide how to separate the l10n-necessary language names from the l10n-unnecessary language names.
  - Do we separate 2-char vs. 3-char, or do we use another method?
- Decide how to improve the Languages selection interface.
- Decide how we should handle 'q' values in the Accept-Language header.
  - Should we just allow them to be automatically generated from the given order, as is apparently the existing behavior?
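To make the 'q' value question concrete, here is a minimal sketch of generating weights purely from list order. The evenly spaced scheme and the function name `accept_language_header` are assumptions for illustration and are not taken from the Firefox source; the only fixed constraint used is that HTTP allows at most three decimal places in a q-value.

```python
def accept_language_header(tags):
    """Build an Accept-Language value whose q-values follow list order.

    Assumption: weights are evenly spaced between 1 and 0. The real
    Firefox scheme may differ; this only illustrates the idea.
    """
    n = len(tags)
    parts = []
    for i, tag in enumerate(tags):
        q = 1 - i / n
        if i == 0:
            # The first entry keeps the implicit default weight of 1.
            parts.append(tag)
        else:
            # HTTP q-values allow at most three decimal places.
            q_text = f"{q:.3f}".rstrip("0").rstrip(".")
            parts.append(f"{tag};q={q_text}")
    return ",".join(parts)


print(accept_language_header(["en-US", "en"]))
```

With this scheme the user never sees or edits the weights; reordering the list is the only way to change them, which is the behavior the question above refers to.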
Areas of focus
Master list
Current state
- There is no true master list of language names.
- All 2-letter and a handful of 3-letter codes have associated language names within the 'en-US' locale.
- Language names are essentially arbitrary and subjective, with changes having been made to the names of politically charged languages and places.
Desired state
- A master list of language names and associated information, based on the official IANA database.
- Allow locales (including 'en-US') to localize (override) the names in the master list.
  - This would be where politically-charged names would be changed.
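As a rough illustration of deriving a master list from the IANA registry, the sketch below parses the registry's record-jar layout (records separated by `%%` lines, `Field: value` pairs, continuation lines starting with whitespace). The sample text is an abbreviated excerpt in that layout with a placeholder `File-Date`, and the function names `parse_registry` and `master_names` are hypothetical.

```python
import re

# Abbreviated sample in the registry's record-jar layout (File-Date is a placeholder).
SAMPLE = """File-Date: 2011-06-10
%%
Type: language
Subtag: en
Description: English
Added: 2005-10-16
Suppress-Script: Latn
%%
Type: script
Subtag: Latn
Description: Latin
Added: 2005-10-16
%%
Type: region
Subtag: US
Description: United States
Added: 2005-10-16
"""


def parse_registry(text):
    """Split registry text into records; each field maps to a list of values."""
    records = []
    for chunk in re.split(r"^%%[ \t]*$", text, flags=re.MULTILINE):
        record = {}
        last_key = None
        for line in chunk.splitlines():
            if not line.strip():
                continue
            if line[:1].isspace() and last_key:
                # Continuation line: append to the previous field's value.
                record[last_key][-1] += " " + line.strip()
                continue
            key, _, value = line.partition(":")
            last_key = key.strip()
            record.setdefault(last_key, []).append(value.strip())
        if "Type" in record:  # skips the File-Date header block
            records.append(record)
    return records


def master_names(records):
    """Map (type, subtag) to the first Description, used as the default name."""
    return {
        (r["Type"][0], r["Subtag"][0]): r["Description"][0]
        for r in records
        if "Subtag" in r
    }


names = master_names(parse_registry(SAMPLE))
print(names[("language", "en")])
```

Keeping every field as a list preserves records with several `Description` lines, so a locale override layer can still choose a different default than the first one.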
Language preferences UI
Current state
- Very limited support for language codes, supporting only 'Primary Language Subtags' and 'Region Subtags' (in a limited way).
- Not up-to-date with BCP 47.
- No support for additional subtags (including 'Script Subtags') or hard-coded 'q' values.
- The language UI merely takes the value of the 'intl.accept_languages' preference and splits it by comma and then by a single hyphen.
  - Proper parsing (which is already limited) is only done if the language tag is of the format 'xx' or 'xx-ZZ', and only if the corresponding names are available. Otherwise, the item is displayed unparsed.
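The current behavior described above can be modeled roughly as follows. This is a sketch of the description, not the actual Firefox code: the function name `describe_accept_languages`, the name tables passed in, and the label format are all assumptions.

```python
import re

# Only 'xx' or 'xx-ZZ' is recognized: two letters, optionally a hyphen and two more.
SIMPLE_TAG = re.compile(r"^([A-Za-z]{2})(?:-([A-Za-z]{2}))?$")


def describe_accept_languages(pref_value, language_names, region_names):
    """Return one display label per tag in a comma-separated preference value."""
    labels = []
    for tag in pref_value.split(","):
        tag = tag.strip()
        if not tag:
            continue
        label = tag  # default: show the tag unparsed
        match = SIMPLE_TAG.match(tag)
        if match:
            lang = language_names.get(match.group(1).lower())
            region_code = match.group(2)
            if lang and region_code is None:
                label = f"{lang} [{tag}]"
            elif lang and region_code.lower() in region_names:
                label = f"{lang}/{region_names[region_code.lower()]} [{tag}]"
        labels.append(label)
    return labels


print(describe_accept_languages(
    "en-us, en, zh-Hant-TW, tlh",
    {"en": "English"},
    {"us": "United States"},
))
```

In this model, a tag with a script subtag ('zh-Hant-TW') or a 3-letter language code ('tlh') falls through and is shown as the raw tag.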
Desired state
- All possible valid combinations of subtags are supported, with corresponding names available.
- More intuitive manipulation in the UI.
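Supporting arbitrary subtag combinations first requires splitting a tag into its component subtags. The sketch below classifies subtags by shape only, following the BCP 47 (RFC 5646) syntax in simplified form; it does not check subtags against the IANA registry, ignores grandfathered tags, and leaves extensions and private-use sequences uninterpreted. The function name `parse_tag` and the result layout are hypothetical.

```python
import re


def parse_tag(tag):
    """Classify the subtags of a language tag by shape (simplified BCP 47)."""
    parts = tag.split("-")
    if not all(re.fullmatch(r"[A-Za-z0-9]{1,8}", p) for p in parts):
        raise ValueError(f"malformed subtag in {tag!r}")
    if not re.fullmatch(r"[A-Za-z]{2,3}", parts[0]):
        raise ValueError(f"unsupported primary language subtag in {tag!r}")

    out = {"language": parts[0].lower(), "extlang": [], "script": None,
           "region": None, "variants": [], "rest": []}
    i = 1
    # Up to three 3-letter extended language subtags may follow the language.
    while i < len(parts) and i <= 3 and re.fullmatch(r"[A-Za-z]{3}", parts[i]):
        out["extlang"].append(parts[i].lower())
        i += 1
    # Script: four letters, conventionally title case.
    if i < len(parts) and re.fullmatch(r"[A-Za-z]{4}", parts[i]):
        out["script"] = parts[i].title()
        i += 1
    # Region: two letters or three digits, conventionally upper case.
    if i < len(parts) and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", parts[i]):
        out["region"] = parts[i].upper()
        i += 1
    # Variants: 5-8 alphanumerics, or a digit followed by three alphanumerics.
    while i < len(parts) and re.fullmatch(
            r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}", parts[i]):
        out["variants"].append(parts[i].lower())
        i += 1
    # Whatever remains must start with a singleton (extension or 'x').
    out["rest"] = parts[i:]
    if out["rest"] and len(out["rest"][0]) != 1:
        raise ValueError(f"unexpected subtag {out['rest'][0]!r} in {tag!r}")
    return out


print(parse_tag("zh-Hant-TW"))
```

A UI built on such a parser could then look up a name for each component separately instead of requiring a name for every full combination.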
L10n
Current state
- A full list of language and region names must be re-localized for each locale.
- Subtags without a localized name face issues.
- Localization teams for languages without a 2-letter language code must get their 3-letter code added manually to all locales.
Desired state
- Localization teams only localize language names (etc.) that are commonly localized in their locale. Otherwise, they default to the value on the master list.
- To simplify the work of localizers, a list of "commonly localized" language names should be provided, extending the list currently in languageNames.properties (perhaps according to Kevin Scannell's suggestions in Bug 356038, Comment 37).
- Any valid combination of subtags (per BCP 47) can create a Firefox localization team and locale.
- Locales that are not valid subtag combinations are phased out (e.g. 'ja-JP-mac').
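The fallback described in the desired state could look roughly like this. The tables `MASTER` and `OVERRIDES` and the function `language_name` are hypothetical stand-ins for the master list and the per-locale string files.

```python
# Hypothetical master list: default names keyed by language subtag.
MASTER = {"en": "English", "de": "German", "ga": "Irish", "tlh": "Klingon"}

# Hypothetical per-locale overrides: only commonly localized names appear here.
OVERRIDES = {
    "de": {"en": "Englisch", "de": "Deutsch"},
}


def language_name(subtag, locale):
    """Prefer the locale's own name, then the master list, then the raw subtag."""
    localized = OVERRIDES.get(locale, {})
    if subtag in localized:
        return localized[subtag]
    return MASTER.get(subtag, subtag)


print(language_name("en", "de"))   # name localized by the 'de' team
print(language_name("tlh", "de"))  # falls back to the master list
```

Under this arrangement, adding a new language to the master list makes it displayable in every locale at once, without each team having to touch its own files.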
Spellchecking
Current state
- There are a limited number of built-in spellcheckers.
- Additional spellcheckers can be added as extensions.
- Spellcheckers are not approved on AMO if there is not a respective language name string in Firefox.
Desired state
- Have built-in language names for all languages acknowledged by the IANA.
- Allow spellcheckers (and AMO) to use master list for default language names.
Font negotiation
(this needs work)
Current state
- Private-use language tags of some sort, such as 'x-western', are used for font categories.
Desired state
- Better font negotiation, keeping in mind the forward-compatible drive towards UTF-8.
  - Would probably involve the 'Script Subtag' and possibly-associated 'Suppress-Script' value.