Firefox/Projects/FTS and Awesomebar: Difference between revisions
(→Milestones: Hmm, be more realistic about milestone dates.)
Revision as of 04:11, 13 March 2010
SQLite supports indexed full text search (FTS) through its fts extensions. Indexed FTS makes text searches fast: rather than examining every record in the database to see whether it contains a search string, SQLite finds the target records by looking the search string up in an index.
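To make the scan-versus-index difference concrete, here is a minimal sketch in Python. It is a toy inverted index, not how SQLite's fts extensions are implemented, and the sample records and function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (id, title) pairs standing in for history entries.
RECORDS = [
    (1, "Mozilla Firefox home page"),
    (2, "SQLite full text search"),
    (3, "Firefox awesomebar tips"),
]


def scan_search(records, term):
    """Unindexed search: examine every record for the term."""
    term = term.lower()
    return [rid for rid, title in records if term in title.lower().split()]


def build_index(records):
    """Build an inverted index mapping each word to the ids of records containing it."""
    index = defaultdict(set)
    for rid, title in records:
        for word in title.lower().split():
            index[word].add(rid)
    return index


def index_search(index, term):
    """Indexed search: one dictionary lookup, no pass over the records."""
    return sorted(index.get(term.lower(), set()))


index = build_index(RECORDS)
print(scan_search(RECORDS, "firefox"))  # [1, 3]
print(index_search(index, "firefox"))   # [1, 3]
```

Both functions return the same ids; the difference is that the scan touches every record on every search, while the index pays that cost once when it is built.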
But by default the fts extensions aren't suitable for international text. This project will try to make them suitable so we can use indexed FTS to improve all of our users' experiences with Firefox. A good first target feature in Firefox is the awesomebar.
Note that full text search does not mean that we store the entire text of pages. We will use it to store page titles, URLs, tags, etc., as we already do now; the only difference is that lookups will be faster.
- Champion, lead: adw
Status
TAKING OFF
- Gecko has some facilities for i18n word boundary analysis, but they're not comprehensive or suitable for Firefox's 300 million users. Too bad.
- Thunderbird does FTS. I talked with asuth about it, and unfortunately their i18n tokenizer doesn't seem appropriate for us either.
- Investigating pulling some components of ICU into our tree. ICU is a large, established i18n library that has facilities for word boundary analysis and tokenization. SQLite supports an ICU tokenizer out of the box.
- I was able to build our SQLite with ICU support. It works! I'm building and linking against an ICU build outside of our tree, because I don't want to focus on the grunt work of building and linking inside the tree with our tools right now. I assume doing so is not impossible...
- Next I would like to do some tests to see its potential for improving awesomebar searches.
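A rough illustration of why word boundary analysis matters for i18n text: the sketch below uses a whitespace-only tokenizer as a stand-in for a naive tokenizer. It is an assumption for illustration, not SQLite's actual default tokenizer and not ICU, and the sample strings are made up.

```python
def simple_tokenize(text):
    """Split on whitespace only, a stand-in for a naive tokenizer."""
    return text.lower().split()


english = "Tokyo Tower photos"
japanese = "東京タワーの写真"  # roughly "photos of Tokyo Tower", written without spaces

print(simple_tokenize(english))   # ['tokyo', 'tower', 'photos']
print(simple_tokenize(japanese))  # ['東京タワーの写真']

# The word 写真 ("photo") is in the text, but it never becomes its own token,
# so an index built from these tokens cannot match a search for it.
print("写真" in simple_tokenize(japanese))  # False
```

An index is only as good as its tokens. A tokenizer that understands word boundaries in languages written without spaces, which is what ICU offers, would be expected to emit separate tokens here.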
Goals
- Make awesomebar results come back from the database faster. (Note that async awesomebar prevents the UI from locking up, which is great, but it doesn't make the database queries any faster.)
Non Goals
- Improving user-facing features in Firefox other than the awesomebar. That's for follow-up work.
- Pulling in parts of ICU (or any other lib) not required for i18n FTS.
Milestones
Note: Dates in the future are only estimates.
- 2010/03 - [Complete] Investigate i18n tokenizers
- 2010/03 - [Complete] Get ICU up and running with our SQLite, Storage
- 2010/03 - Test potential perf impact on awesomebar searches
- 2010/06 - Integrate the required ICU components with our tree, build system
- 2010/08 - Integrate awesomebar with SQLite's fts extension using ICU tokenizer
Delivery Requirements
- Testing to make sure awesomebar functionality, and certainly perf, does not regress.
Constraints
- Have to convince people that pulling in parts of ICU is worth it. Expect pushback...
Dependencies
- Since this project is broadly defined -- improving FTS all the way to using FTS in the awesomebar -- none.
Testing
- Will require manual testing of the awesomebar.
- Maybe we can set up some automated harness to time awesomebar searches.
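One possible shape for such a timing harness, sketched in Python with the standard `sqlite3` module. Everything here is an assumption for illustration: the `pages` table is a made-up schema rather than the real Places schema, the data is synthetic, and the `LIKE` query is only a full-scan baseline that an FTS query could later be timed against.

```python
import sqlite3
import statistics
import time


def make_db(n_rows=5000):
    """Create an in-memory table of made-up history entries (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT, url TEXT)")
    rows = [(i, f"Page number {i}", f"http://example.com/{i}") for i in range(n_rows)]
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def time_query(conn, sql, params=(), repeats=20):
    """Run a query several times and return (median seconds, row count)."""
    timings = []
    row_count = 0
    for _ in range(repeats):
        start = time.perf_counter()
        row_count = len(conn.execute(sql, params).fetchall())
        timings.append(time.perf_counter() - start)
    return statistics.median(timings), row_count


conn = make_db()
# Baseline: a substring match that has to scan every row.
median_s, n = time_query(
    conn,
    "SELECT id FROM pages WHERE title LIKE ? OR url LIKE ?",
    ("%4999%", "%4999%"),
)
print(f"{n} rows, median {median_s * 1000:.3f} ms")
```

Taking the median over several runs keeps one slow outlier from skewing the number. A real harness would need realistic history data and the actual awesomebar queries.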