Necko/Cache/Plans: Difference between revisions

From MozillaWiki
(dummy edit)
 
(36 intermediate revisions by 6 users not shown)
Line 1: Line 1:
= Type 1: Locking / Initialization Performance / Asynchronicity =
= New Cache Plans =


These are run-time items, things that don't involve changing anything about the cache on-disk.
We have decided to rewrite our HTTP disk cache.


== Open ==
== People ==


* {{nbug|717761}} - Main thread can be blocked by IO on the cache thread
The design team will be responsible for coming up with a design for the new disk cache. The design should be thorough and well-documented. Once the design team is satisfied with an initial design document, the implementation team will start implementing.
* {{nbug|701909}} - Disk cache seems to cause exceptionally slow startups(1min+)
* {{nbug|723362}} - Make nsCacheEntryDescriptor::Doom asynchronous
* {{nbug|695399}} - Remove synchronous cache API (nsICacheSession::openCacheEntry)
** Alternatively leave them but entirely disabled on the main thread.
* {{nbug|761736}} - CACHE_SERVICE_LOCK_MAINTHREAD possibly regressed
* {{nbug|744388}}, {{nbug|762094}} - New tab page shouldn't use synchronous cache APIs.
* {{nbug|763555}} - Do not do I/O while holding the cache lock


== Closed ==
Design team:
* Michal Novotny
* Taras Glek
* Steve Workman
* Honza Bambas
* Nick Hurley
* Brian Bondy
* Doug Turner
* Patrick McManus


* {{nbug|716293}} - IO on the main thread caused by nsDiskCacheOutputStream::Close()
<br>Implementation team:
* {{nbug|722034}} - Cache entry check--CheckCache()--is done on the main thread, causing the main thread to wait on the cache service lock, which may be blocked on disk I/O
* Honza Bambas
* {{nbug|722033}} - Remove calls to synchronous openCacheEntry in nsHttpChannel
* Michal Novotny


= Type 2: Deleting / Moving / Sizing Entire Cache =
== Primary Design Goals ==


These items involve our ability to delete and move the disk cache as a whole, treating it as "black box."
This section documents issues that need to be addressed in the new cache's design.


== Open ==
* Version API for the cache so we can update easily.
* All APIs should be async. No main-thread locking or I/O at all.
* A crash or abnormal program termination should not invalidate the entire cache.
* Support gzip compression. Metadata should record whether a file is gzipped, so that we can choose at runtime, on a per-file basis, whether to write compressed or uncompressed data. Files that arrive gzipped from the network should be passed through as-is.
* Make use of fallocate.
* Minimize API surface, especially for APIs exposed to JS/extensions. All exposed APIs should have a clear, safe use case.
* Consider eliminating memory cache.
* Competing ideas:
** Temporal layout so that sub-resources are together.
** Don't over-optimize on-disk storage; use one file per entry and let the OS optimize.
* Layered design should include XPCOM API, C++ API exposed to Gecko, middle layer with general cache logic, and back-end allowing for alternative on-disk formats.
* Separate services for HTTP and offline cache? Find a way to make these use cases work well without over-complicating code.
* Browser should behave properly with disk cache entirely disabled.
* Allow the cache to be raced against the network, so that we do not wait on them serially.
* Use this same cache for more general metadata, e.g. hosts for DNS prewarming, appcache namespaces plus their other data and versioning, and any useful host-specific data we now gather in memory and throw away after restart (SPDY preference, TLS tolerance, successful pipelining test, etc.).


* Dynamic sizing optimization. Need bug number here.
== Success Metrics ==
* Allow cache to be deleted out from under us so that we can put cache in designated purge-able cache directories.
* Make sure we are deleting unwanted caches as efficiently and with as little disruption as possible.


== Closed ==
This section documents the ways in which we'll determine whether or not the new cache design is a success.


= Type 3: On-Disk Layout Changes =
* It should not be possible to trigger main-thread I/O.
* Create telemetry for with-cache and without-cache. The cache should be faster than no cache for both the top 50% and the low 50%.


These items involve changing the on-disk layout for the cache, changing things inside the "black box."
== API ==


== Open ==
This section documents the APIs for interacting with the new disk cache.


* {{nbug|105843}} - Cache lost if Mozilla crashes
[[Necko/Cache/Plans/Draft Proposal|API Changes proposal]]
* {{nbug|715714}} - Adapt cache based on io rates
* {{nbug|715752}} - Add version numbers to cache
* Disk layout optimizations for different device types (SSD vs. HDD)


== Closed ==
=== XPCOM APIs (exposed to JS) ===


= Prioritization =
=== C++ APIs (exposed to Necko) ===


Only bug numbers go in this list. If it doesn't have a bug it can't be a priority. Any bugs in this list should also be in an above bucket with a description.
== Locking ==


# {{nbug|717761}}
This section describes how locking will work in the new disk cache. Ideally this should document every lock that will be necessary in the new cache.
# {{nbug|105843}}
 
== On-Disk Layout ==
 
This section describes the on-disk layout of the disk cache. It may describe a default on-disk layout and any number of alternatives required for the first revision of the new disk cache.

Latest revision as of 18:40, 14 June 2022

New Cache Plans

We have decided to rewrite our HTTP disk cache.

People

The design team will be responsible for coming up with a design for the new disk cache. The design should be thorough and well-documented. Once the design team is satisfied with an initial design document, the implementation team will start implementing.

Design team:

  • Michal Novotny
  • Taras Glek
  • Steve Workman
  • Honza Bambas
  • Nick Hurley
  • Brian Bondy
  • Doug Turner
  • Patrick McManus


Implementation team:

  • Honza Bambas
  • Michal Novotny

Primary Design Goals

This section documents issues that need to be addressed in the new cache's design.

  • Version API for the cache so we can update easily.
  • All APIs should be async. No main-thread locking or I/O at all.
  • A crash or abnormal program termination should not invalidate the entire cache.
  • Support gzip compression. Metadata should record whether a file is gzipped, so that we can choose at runtime, on a per-file basis, whether to write compressed or uncompressed data. Files that arrive gzipped from the network should be passed through as-is.
  • Make use of fallocate.
  • Minimize API surface, especially for APIs exposed to JS/extensions. All exposed APIs should have a clear, safe use case.
  • Consider eliminating memory cache.
  • Competing ideas:
    • Temporal layout so that sub-resources are together.
    • Don't over-optimize on-disk storage; use one file per entry and let the OS optimize.
  • Layered design should include XPCOM API, C++ API exposed to Gecko, middle layer with general cache logic, and back-end allowing for alternative on-disk formats.
  • Separate services for HTTP and offline cache? Find a way to make these use cases work well without over-complicating code.
  • Browser should behave properly with disk cache entirely disabled.
  • Allow the cache to be raced against the network, so that we do not wait on them serially.
  • Use this same cache for more general metadata, e.g. hosts for DNS prewarming, appcache namespaces plus their other data and versioning, and any useful host-specific data we now gather in memory and throw away after restart (SPDY preference, TLS tolerance, successful pipelining test, etc.).
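
The "async APIs" and "race cache against network" goals can be illustrated with a small sketch. This is not a proposed Necko API: it is written in Python for brevity, and read_from_cache, fetch_from_network, load, and the delay parameters are all hypothetical names invented for the example. It only shows the control flow of starting both lookups at once and using whichever usable result arrives first, while ignoring HTTP validation entirely.

```python
import asyncio
from typing import Optional, Tuple


async def read_from_cache(key: str, cache: dict, delay: float) -> Optional[bytes]:
    """Hypothetical async cache lookup; returns None on a miss."""
    await asyncio.sleep(delay)  # stands in for disk latency
    return cache.get(key)


async def fetch_from_network(key: str, delay: float) -> bytes:
    """Hypothetical network fetch."""
    await asyncio.sleep(delay)  # stands in for network latency
    return b"network:" + key.encode()


async def load(key: str, cache: dict, cache_delay: float, net_delay: float) -> Tuple[str, bytes]:
    """Start both lookups together and return (source, body) for the first usable result."""
    cache_task = asyncio.create_task(read_from_cache(key, cache, cache_delay))
    net_task = asyncio.create_task(fetch_from_network(key, net_delay))
    pending = {cache_task, net_task}
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED
            )
            # A cache hit wins, including when both finish in the same step.
            if cache_task in done and cache_task.result() is not None:
                return "cache", cache_task.result()
            if net_task in done:
                return "network", net_task.result()
            # Otherwise the cache missed first; keep waiting for the network.
    finally:
        # Cancel whichever lookup lost the race.
        for task in pending:
            task.cancel()
    raise RuntimeError("both lookups finished without a result")
```

With a slow disk and a fast network, load returns the network response without waiting for the cache read to finish, which is the serial wait the goal above is meant to avoid.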

Success Metrics

This section documents the ways in which we'll determine whether or not the new cache design is a success.

  • It should not be possible to trigger main-thread I/O.
  • Create telemetry for with-cache and without-cache. The cache should be faster than no cache for both the top 50% and the low 50%.

API

This section documents the APIs for interacting with the new disk cache.

API Changes proposal

XPCOM APIs (exposed to JS)

C++ APIs (exposed to Necko)

Locking

This section describes how locking will work in the new disk cache. Ideally this should document every lock that will be necessary in the new cache.

On-Disk Layout

This section describes the on-disk layout of the disk cache. It may describe a default on-disk layout and any number of alternatives required for the first revision of the new disk cache.
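
As a starting point for discussion only, here is a minimal sketch of the "one file per entry" idea from the design goals, combined with a format version, a per-entry gzip flag, and a checksum so that a damaged entry can be discarded without invalidating the rest of the cache. Everything in it is an assumption made for illustration: the header fields, the MAGIC and FORMAT_VERSION values, and the write_entry and read_entry names do not come from any agreed design, and Python is used only to keep the sketch short.

```python
import gzip
import os
import struct
import zlib
from typing import Optional

MAGIC = b"NCE1"        # hypothetical magic value
FORMAT_VERSION = 1     # hypothetical on-disk format version
FLAG_GZIP = 0x1        # stored bytes are gzip-compressed
# magic, version, flags, stored length, CRC32 of the stored bytes
HEADER = struct.Struct("<4sHHII")


def write_entry(path: str, body: bytes, compress: bool = False,
                already_gzipped: bool = False) -> None:
    """Write one entry to its own file, replacing any previous version atomically."""
    if already_gzipped:
        stored, flags = body, FLAG_GZIP              # pass network gzip through as-is
    elif compress:
        stored, flags = gzip.compress(body), FLAG_GZIP
    else:
        stored, flags = body, 0
    header = HEADER.pack(MAGIC, FORMAT_VERSION, flags, len(stored), zlib.crc32(stored))
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(header + stored)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # readers see either the old entry or the new one


def read_entry(path: str) -> Optional[bytes]:
    """Return the decoded body, or None if the entry is missing, stale, or damaged."""
    try:
        with open(path, "rb") as f:
            raw = f.read()
    except OSError:
        return None
    if len(raw) < HEADER.size:
        return None
    magic, version, flags, length, crc = HEADER.unpack_from(raw)
    stored = raw[HEADER.size:]
    if magic != MAGIC or version != FORMAT_VERSION:
        return None  # unknown format: drop this entry only
    if len(stored) != length or zlib.crc32(stored) != crc:
        return None  # truncated or corrupted, e.g. after a crash
    return gzip.decompress(stored) if flags & FLAG_GZIP else stored
```

Because validation happens per entry, a crash in the middle of a write costs at most the entry being written. Whether this trade-off beats a temporal layout that keeps sub-resources together is exactly the open question listed under the competing ideas.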