Line 22: | Line 22: | ||
has not applied for inclusion, or because we do not think they have sufficiently | has not applied for inclusion, or because we do not think they have sufficiently | ||
strong protections in place. In addition, ICANN is about to open | strong protections in place. In addition, ICANN is about to open | ||
a [http://newgtlds.icann.org/ large number of new TLDs]. So either maintaining a whitelist is going to become | a [http://newgtlds.icann.org/en/program-status/application-results/strings-1200utc-13jun12-en large number of new TLDs]. So either maintaining a whitelist is going to become | ||
burdensome, or the list will become wildly out of date and we will not | burdensome, or the list will become wildly out of date and we will not | ||
be serving our users. | be serving our users. | ||
Line 63: | Line 63: | ||
* Common + Inherited + Latin + Han + Bopomofo; or | * Common + Inherited + Latin + Han + Bopomofo; or | ||
* Common + Inherited + Latin + Han + Hangul; or | * Common + Inherited + Latin + Han + Hangul; or | ||
* Common + Inherited + Latin + any single other script except Cyrillic | * Common + Inherited + Latin + any single other "Recommended" or "Aspirational" script except Cyrillic or Greek | ||
</blockquote> | </blockquote> | ||
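The combinations listed above can be sketched as a set comparison. This is a minimal illustration, not the Firefox implementation: the function name `matches_listed_combination` is hypothetical, `SAMPLE_ELIGIBLE_SCRIPTS` is only a small illustrative subset of the "Recommended" and "Aspirational" scripts, and the sketch assumes a label matches a combination when its set of scripts is a subset of that combination. Only the three combinations visible in this excerpt are covered, and single-script labels are assumed to be handled by a separate check.

```python
# Scripts that every listed combination permits.
BASE = frozenset({"Common", "Inherited", "Latin"})

# Illustrative subset only (assumption): the full "Recommended" and
# "Aspirational" script lists are defined by Unicode, not by this sketch.
SAMPLE_ELIGIBLE_SCRIPTS = frozenset({
    "Arabic", "Armenian", "Bengali", "Cyrillic", "Devanagari",
    "Georgian", "Greek", "Han", "Hebrew", "Tamil", "Thai",
})

# Scripts explicitly excluded from the "any single other script" rule.
EXCLUDED_WITH_LATIN = frozenset({"Cyrillic", "Greek"})

# The two fixed multi-script combinations shown in this excerpt.
FIXED_COMBINATIONS = (
    BASE | {"Han", "Bopomofo"},
    BASE | {"Han", "Hangul"},
)


def matches_listed_combination(scripts):
    """Return True if the set of script names fits one listed combination."""
    scripts = frozenset(scripts)
    # Fixed combinations: every script in the label must be in the combination.
    if any(scripts <= combo for combo in FIXED_COMBINATIONS):
        return True
    # Otherwise allow Common + Inherited + Latin plus at most one other script.
    others = scripts - BASE
    if not others:
        return True
    if len(others) > 1:
        return False
    (other,) = others
    return other in SAMPLE_ELIGIBLE_SCRIPTS and other not in EXCLUDED_WITH_LATIN
```

For example, a label mixing Latin and Thai matches under these assumptions, while a label mixing Latin and Cyrillic does not.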
[http://www.unicode.org/reports/tr39/#Mixed_Script_Detection Unicode Technical Report 39] gives | [http://www.unicode.org/reports/tr39/#Mixed_Script_Detection Unicode Technical Report 39] gives | ||
a definition for how we detect whether a string is "single script". Some Common or Inherited characters | a definition for how we detect whether a string is "single script". | ||
Some Common or Inherited characters | |||
are only used in a small number of scripts (but more than one). Mark Davis writes: | are only used in a small number of scripts (but more than one). Mark Davis writes: | ||
"The Unicode Consortium in U6.1 (due out soon) is adding the property [http://unicode.org/Public/6.1.0/ucd/ScriptExtensions.txt Script_Extensions], | "The Unicode Consortium in U6.1 (due out soon) is adding the property [http://unicode.org/Public/6.1.0/ucd/ScriptExtensions.txt Script_Extensions], | ||
to provide data about characters which are only used in a few (but more than one) script. | to provide data about characters which are only used in a few (but more than one) script. | ||
The sample code in #39 should be updated to include that, so handling such cases." | The sample code in #39 should be updated to include that, so handling such cases." | ||
This data is now available, but not yet in the Firefox platform. In the meantime, Common and Inherited | |||
characters are permitted without restriction. | characters are permitted without restriction. | ||
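A rough sketch of a "single script" test that permits Common and Inherited characters without restriction, as described above. Python's standard library does not expose the Unicode Script property, so `toy_script_of` is a hypothetical stand-in that covers only a handful of code point ranges; a real implementation would consult the Unicode Character Database. The function names are assumptions made for illustration.

```python
def toy_script_of(ch):
    """Toy script lookup covering a few ranges only (illustrative assumption)."""
    cp = ord(ch)
    if 0x41 <= cp <= 0x5A or 0x61 <= cp <= 0x7A:
        return "Latin"       # ASCII letters
    if 0x0430 <= cp <= 0x044F:
        return "Cyrillic"    # basic Cyrillic small letters
    if 0x0300 <= cp <= 0x036F:
        return "Inherited"   # combining diacritical marks
    if ch.isdigit() and cp < 0x80 or ch == "-":
        return "Common"      # ASCII digits and hyphen
    raise ValueError(f"toy table has no entry for U+{cp:04X}")


def is_single_script(label, script_of=toy_script_of):
    """True if, ignoring Common and Inherited, at most one script is used."""
    scripts = {script_of(ch) for ch in label}
    significant = scripts - {"Common", "Inherited"}
    return len(significant) <= 1
```

With this toy table, `is_single_script("example-1")` holds because the digit and hyphen are Common, while a Latin label containing one Cyrillic letter is reported as mixed.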
We also implement additional checks, as suggested by TR #39 sections 5.3 and 5.4: | |||
* Display as Punycode labels which use more than one numbering system | * Display as Punycode labels which use more than one numbering system |