No edit summary |
(→Status) |
||
Line 15: | Line 15: | ||
TAKING OFF | TAKING OFF | ||
* Gecko has some [http://mxr.mozilla.org/mozilla-central/source/intl/lwbrk/idl/nsISemanticUnitScanner.idl facilities] for i18n word boundary analysis, but they're not comprehensive or suitable for Firefox's 300 million users. Too bad. | * Gecko has some [http://mxr.mozilla.org/mozilla-central/source/intl/lwbrk/idl/nsISemanticUnitScanner.idl facilities] for i18n word boundary analysis, but they're not comprehensive or suitable for Firefox's 300 million users. (They aren't even used anywhere in the tree.) Too bad. | ||
* Thunderbird does FTS. I talked with asuth about it, and unfortunately their i18n tokenizer doesn't seem appropriate for us either. | * Thunderbird does FTS. I talked with asuth about it, and unfortunately their i18n tokenizer doesn't seem appropriate for us either. | ||
* Investigating pulling some components of [http://site.icu-project.org/ ICU] into our tree. ICU is a large, established, and widely used i18n library that has facilities for word boundary analysis and tokenization. SQLite supports an ICU tokenizer out of the box. | * Investigating pulling some components of [http://site.icu-project.org/ ICU] into our tree. ICU is a large, established, and widely used i18n library that has facilities for word boundary analysis and tokenization. SQLite supports an ICU tokenizer out of the box. |