Necko/Cache/Plans

From MozillaWiki
Latest revision as of 18:40, 14 June 2022

New Cache Plans

We have decided to rewrite our HTTP disk cache.

People

The design team will be responsible for coming up with a design for the new disk cache. The design should be thorough and well-documented. Once the design team is satisfied with an initial design document, the implementation team will start implementing.

Design team:

  • Michal Novotny
  • Taras Glek
  • Steve Workman
  • Honza Bambas
  • Nick Hurley
  • Brian Bondy
  • Doug Turner
  • Patrick McManus


Implementation team:

  • Honza Bambas
  • Michal Novotny

Primary Design Goals

This section documents issues that need to be addressed in the new cache's design.

  • Version API for the cache so we can update easily.
  • All APIs should be async. No main-thread locking or I/O at all.
  • A crash or abnormal program termination should not invalidate the entire cache.
  • Support gzip compression. Metadata should record whether a file is gzipped, and the cache should be able to choose at runtime, on a per-file basis, whether to write compressed or uncompressed data. Files that arrive gzipped from the network should be passed through as-is.
  • Make use of fallocate.
  • Minimize API surface, especially for APIs exposed to JS/extensions. All exposed APIs should have a clear, safe use case.
  • Consider eliminating memory cache.
  • Competing ideas:
    • Temporal layout so that sub-resources are together.
    • Don't over-optimize on-disk storage, use one file per entry and let OS optimize.
  • Layered design should include XPCOM API, C++ API exposed to Gecko, middle layer with general cache logic, and back-end allowing for alternative on-disk formats.
  • Separate services for HTTP and offline cache? Find a way to make these use cases work well without over-complicating code.
  • Browser should behave properly with disk cache entirely disabled.
  • Allow the cache to be raced effectively against the network, so that a request does not wait on the two serially.
  • Use this same cache for more general metadata, e.g. cached hosts for DNS prewarming, appcache namespaces plus their other data and versioning, and any useful host-specific data that we now gather in memory and throw away after restart (SPDY preference, TLS tolerance, pipelining test results, etc.).
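
The goal of racing the cache against the network can be illustrated with a minimal sketch. It is not the Necko design: the function names (read_from_cache, fetch_from_network, race), the dict standing in for the disk cache, and the artificial delays are all assumptions made for illustration. Both lookups start at once, and the first one that yields data wins; a cache miss simply leaves the network request running.

```python
import asyncio


async def read_from_cache(key, cache, delay):
    # Hypothetical disk-cache lookup: sleep to simulate disk latency,
    # then return the cached value, or None on a miss.
    await asyncio.sleep(delay)
    return cache.get(key)


async def fetch_from_network(key, delay):
    # Hypothetical network fetch: sleep to simulate a round trip.
    await asyncio.sleep(delay)
    return f"network:{key}"


async def race(key, cache, cache_delay, net_delay):
    """Start both lookups together and return (source, data) from the
    first one that produces data, instead of waiting on them serially."""
    cache_task = asyncio.create_task(read_from_cache(key, cache, cache_delay))
    net_task = asyncio.create_task(fetch_from_network(key, net_delay))
    pending = {cache_task, net_task}
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED
            )
            # If both finish in the same step, prefer the cache.
            for task in (cache_task, net_task):
                if task in done and task.result() is not None:
                    source = "cache" if task is cache_task else "network"
                    return source, task.result()
    finally:
        # Cancel whichever lookup lost the race.
        for task in pending:
            task.cancel()
    raise LookupError(key)


if __name__ == "__main__":
    demo_cache = {"a": "cached:a"}
    print(asyncio.run(race("a", demo_cache, 0.01, 0.2)))
    print(asyncio.run(race("a", demo_cache, 0.2, 0.01)))
```

Whether a late cache hit should still be used to validate or replace a network response is a policy question this sketch leaves open.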

Success Metrics

This section documents the ways in which we'll determine whether or not the new cache design is a success.

  • It should not be possible to trigger main-thread I/O.
  • Create telemetry for with-cache and without-cache page loads. For both the top 50% and the low 50%, the cache should be faster than no cache.
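
One possible reading of the telemetry criterion above, sketched under stated assumptions: the page does not say what the "top 50%" and "low 50%" are measured over, so this example assumes they are the faster and slower halves of load-time samples, and compares the median of each half. The function names and the sample values are hypothetical, and each sample list needs at least two entries.

```python
from statistics import median


def split_halves(samples):
    # Sort ascending and split into the faster half and the slower half.
    ordered = sorted(samples)
    mid = len(ordered) // 2
    return ordered[:mid], ordered[mid:]


def cache_wins(with_cache_ms, without_cache_ms):
    """Report, for each half of the distribution, whether the median
    load time with the cache is lower than without it."""
    fast_c, slow_c = split_halves(with_cache_ms)
    fast_n, slow_n = split_halves(without_cache_ms)
    return {
        "fast_half": median(fast_c) < median(fast_n),
        "slow_half": median(slow_c) < median(slow_n),
    }


if __name__ == "__main__":
    # Made-up timings in milliseconds, for illustration only.
    print(cache_wins([10, 20, 30, 400], [50, 60, 70, 300]))
```

With these made-up numbers the cache wins in the fast half but loses in the slow half, which is the kind of result the metric is meant to surface.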

API

This section documents the APIs for interacting with the new disk cache.

API Changes proposal

XPCOM APIs (exposed to JS)

C++ APIs (exposed to Necko)

Locking

This section describes how locking will work in the new disk cache. Ideally this should document every lock that will be necessary in the new cache.

On-Disk Layout

This section describes the on-disk layout of the disk cache. It may describe a default on-disk layout and any number of alternatives required for the first revision of the new disk cache.
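
As a purely illustrative sketch of one of the competing ideas listed under the design goals (one file per entry, letting the OS optimize), the following shows how a per-entry header could carry a format version and a gzip flag, and how writing to a temporary file and renaming it keeps a crash mid-write from damaging other entries. Nothing here is the chosen layout: the header format, FORMAT_VERSION, and the function names are assumptions.

```python
import gzip
import hashlib
import json
import os
import struct
import tempfile

FORMAT_VERSION = 1  # hypothetical on-disk format version


def entry_path(cache_dir, key):
    # One file per entry, named by a hash of the cache key.
    return os.path.join(cache_dir, hashlib.sha256(key.encode("utf-8")).hexdigest())


def write_entry(cache_dir, key, body, compress):
    """Write header + payload to a temp file, then rename it into place.
    A crash before the rename leaves any existing entry untouched."""
    payload = gzip.compress(body) if compress else body
    meta = json.dumps(
        {"version": FORMAT_VERSION, "key": key, "gzip": compress}
    ).encode("utf-8")
    fd, tmp = tempfile.mkstemp(dir=cache_dir)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(struct.pack("<I", len(meta)))  # 4-byte header length
            f.write(meta)
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, entry_path(cache_dir, key))
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def read_entry(cache_dir, key):
    """Return the entry body, or None if the entry is missing, has an
    unreadable header, or was written by a different format version."""
    try:
        with open(entry_path(cache_dir, key), "rb") as f:
            (meta_len,) = struct.unpack("<I", f.read(4))
            meta = json.loads(f.read(meta_len))
            payload = f.read()
    except (OSError, ValueError, struct.error):
        return None
    if not isinstance(meta, dict):
        return None
    if meta.get("version") != FORMAT_VERSION or meta.get("key") != key:
        return None
    return gzip.decompress(payload) if meta.get("gzip") else payload
```

Because each entry is self-describing, a bad entry can be dropped on its own, and the per-entry gzip flag lets the writer decide at runtime whether to compress. This sketch always decompresses on read; passing gzipped network data through unchanged would need a separate path.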