Gaia/System/Updates

From MozillaWiki

Overview

There are several types of FxOS updates:

  • Gonk (Kernel)
  • Gecko
  • Gaia
  • Apps
    • Core
    • Pre-installed third party
    • User-installed
      • Packaged
      • Non-packaged

Development/Software Channels

  • The requirement is to offer 3 channels for delivering the B2G software stack to users. Each channel will include consistent versions of Gonk, Gecko and Gaia. The proposed channels are:
    • Nightly
    • Beta
      • Weekly stable builds that users can stay on to get the latest bug fixes and feature enhancements.
    • Release
  • All phones manufactured should default to the Release channel.

References


Gonk (Kernel) Updates

Overview

Changes to this layer in the software stack should only be made when it is absolutely necessary. The requirement here is to ensure a stable build at this level of the software to minimize any issues with the phone at the Gaia level.

  • Gonk updates are synonymous with Full-System and FOTA.
  • FOTA = Firmware Over-the-Air
  • Full System = atomically update Gonk+Gecko+Gaia.
  • Require rebooting device (because Gonk requires reboot).
  • Update takes several minutes, during which device is unusable.
  • Will only be used for security bugs in Gonk that cannot be fixed in Gecko or Gaia.
  • Ideally are never needed for shipping devices.
  • Frequency:
    • Immediate for critical security bugs.
    • Quarterly for any non-critical security bugs, if needed. If there are no bug fixes in a given quarter, there is no quarterly update.
  • User can view current version information from Settings > Device > Device Information.

UX Requirements

WIP notes, Josh Carpenter, Aug 15

  • Need to find right balance of automatic versus user-approved.
  • Prompt user when update is available?
  • Prompt user to approve download? eg: "download now" [or] "download later" (placeholder verbiage). Or initiate automatically when certain condition is met (eg: connected to WiFi, and device is inactive).
  • After download complete, prompt user to install? eg: "Install now" [or] "Install later" (placeholder verbiage).
  • Display download progress to user (progress bar)?
  • Display install progress to user (progress bar)?
  • Minimize the risk of unsuccessful install, the consequences of which can include bricking the device.
    • Detect battery life and require user first connect to power source if below X%?
    • Detect low-available device storage and prompt user to clear files before they can proceed?
    • Detect interrupted download (eg: user turns on Airplane mode, user powers down device, etc) and seamlessly complete (or restart) later?

Questions

Open

  • Risks: what causes botched installs?
    • Battery dies in process?
    • Insufficient storage?
    • Corrupted update files?
  • How do we mitigate those risks?
  • Once update has downloaded and user has opted to install, what is the install sequence? eg: Shut down to blank screen, show progress indicator,
  • Backup: is there a rollback feature for failed installs? (Sounds like answer is no).
  • File size: Approximately how large are these updates? Do we check for available device storage before downloading? If insufficient, how do we mitigate?
  • If user defers installation of update:
    • ...presumably they cannot receive subsequent Gecko + Gaia updates?
    • ...do we remind user to reinstall later on?
    • ...and/or automatically install at future point (eg: next time the device reboots)
  • Can we confirm that Full-System updates will be free via Carrier's private APN?

Answered

  • What is time to install? Several minutes?
    • (cjones from IRC): Several minutes, during which device is unusable.
  • Do we require the user to plug in the phone if battery is below X %?
    • (sicking) This is somewhat technically doable, but a UX decision if we actually want to. What we can't do is estimate how good the battery is. I think an old "worn out" battery will fall much quicker in battery level than a new one. It's also hard to get a good estimate of how much battery is needed, but we can certainly require battery levels much higher than what's needed (say 30%).
  • Can we avoid user friction by downloading silently, and only during periods of user inactivity? Would not want to slow down web browsing while update processed in background, for example.
    • (sicking) We don't have good mechanisms for giving lower priority or throttling individual channels in Necko right now. We technically could, though, detect other downloads starting and abort the update in that case, and then resume once we detect user inactivity.
  • Do we provide link to changelog so user can review update details before installing?
    • (sicking) I think this is up to UX to set requirements. Seems technically feasible, but I'm not convinced it's needed given that in every instance we'll likely want to apply the update for security reasons.
  • If the user powers down the device while an update is silently downloading in background, can we resume download later on?
    • (sicking) Necko has the ability to download ranges, but this also needs server support. We certainly could require such support though.
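
The resumable download sicking describes could be sketched roughly as follows. This is an illustration in Python, not Necko code: the function names are hypothetical, and it assumes the update server honours HTTP Range requests (answering 206 Partial Content), which is the server support the answer above asks for.

```python
from typing import Dict, Optional


def resume_headers(bytes_on_disk: int, etag: Optional[str] = None) -> Dict[str, str]:
    """Build request headers that ask the server for the missing tail only."""
    if bytes_on_disk <= 0:
        return {}  # nothing saved yet: plain full download
    headers = {"Range": f"bytes={bytes_on_disk}-"}
    if etag:
        # If-Range asks the server to send the whole file if it changed meanwhile.
        headers["If-Range"] = etag
    return headers


def next_write_offset(status: int, bytes_on_disk: int) -> int:
    """Decide where in the partial file the response body should be written."""
    if status == 206:  # Partial Content: the server honoured the range
        return bytes_on_disk
    if status == 200:  # full body sent: the range was ignored, start over
        return 0
    raise ValueError(f"unexpected HTTP status {status}")
```

If the server replies 200 instead of 206, the partial file is simply overwritten from the start, so a server without range support costs bandwidth but does not corrupt the download.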


Gecko Updates

Overview

  • Will be updated via:
    • Full-System FOTA updates (Gonk+Gecko+Gaia) [or]
    • Atomically w/ Gaia (Gonk not included)
  • Frequency:
    • Based on input from carriers and OEMs, it is likely that updates will need to happen somewhere between the current Desktop (6 week) and Extended Support Release (42 week) intervals. The proposed requirement is to offer regular updates every 18 weeks. This frequency offers Mozilla and our partners the ability to update functionality on the device at a quicker pace than competitor OS stacks (iOS and Android), but at the same time not overwhelm our Carrier partners who may not be used to updating software so frequently.
  • Backups:
    • The requirement is to offer a back-up instance of Gecko to ensure we can fail over when necessary (if we somehow shipped an updated Gecko version that resulted in a bug).
    • Probably a Phase 2 item.
  • Cost: The user will not be charged any carrier network fees for any Gecko update, as agreed upon by Telefonica, who will be making these updates via a private APN.
  • User can view current version information from Settings > Device > Device Information.
  • Update process:
    • Download
      • Files will be binary diff'd to minimize size.
      • If the user interrupts the download (turns on Airplane mode, powers off device, etc.), we can theoretically complete it at a later time.
      • Necko has the ability to download ranges, but this also needs server support. We certainly could require such support though. (as per Jonas)
    • Install
      • Time to apply will vary depending on size of update, internal disk speed, device hardware spec, etc
      • Will require restarting the b2g process (approx. 10 seconds), but not rebooting the device. A reboot _may_ be required as a fail safe if /system is somehow left in read-write after the updater is finished.
      • Battery life: We can detect current battery level, but not drain rate (which varies with battery age). It will also be difficult to estimate the amount of power required to complete an install. Therefore we should build healthy margins into any "minimum required battery" thresholds.
  • User prompts:
    • ...

Bugs

Questions

Open

  • Do we need to have a mechanism for pushing extra-critical updates?
  • Do we check for available device storage before downloading? If insufficient, how do we mitigate?
  • How much user agency do we provide over installs? Can they defer? For how long? What affordances do we make for out of date Gecko?
  • How can the user review the currently-installed version? From Settings?

Answered

  • What is the sequence of events (eg: prompt user to restart device, whereupon install process runs?)
    • (marshall_law)
    • For user prompting sequence / rules, see the requirements laid out by cjones in these bugs:
    • Currently, the plan is to automatically download any Gecko updates to /data/local/updates. IIRC we are working w/ carriers to make sure this isn't billable data.
    • After an update is downloaded, the update is staged in /system/b2g/updated, and the b2g process is cleanly shutdown (and then restarted by the system)
    • Soon after bootup of the new b2g process, the updater runs again, copying the staged updates into /system/b2g to make them live.
    • Note: The updater process will re-mount /system as read-write when it starts, and back to read-only when it exits. This will happen for both staging, and copying the staged files in place. In the event of a failure remounting /system as read-only (before exit), the device will be restarted, allowing the system to mount /system as read-only. See https://bugzilla.mozilla.org/show_bug.cgi?id=764683 for more details
    • (/marshall_law)
  • How often should we check for Gecko update? CLee's etherpad outline says every 18 weeks.
    • (sicking) This sounds too rare given that we ship Gecko updates every 6 weeks which always contain security updates. Usually we count on security researchers being able to reverse engineer those updates and create exploits for older versions of Gecko.
    • (marshall_law) IIRC the reasoning had to do with risk mitigation from carriers (it was a compromise of some kind?)
  • Can we confirm that there will be a back-up instance of Gecko in event of failed updates? If so, what will the sequence of its application be?
  • Does being on 3G/Edge affect when we check for Gecko updates?
    • Should not, since updates will be free OTA via Carrier's private APN.
  • What, if anything, should we tell the user when a Gecko update is detected? Should we behave differently if the user is on 3G/Edge connection when we detect that an update is available?
    • Download update and apply silently in background, same as Gonk process? Might be too intrusive for these more frequent (18 weeks) Gecko updates?
    • (marshall_law) see above
  • Do we have the technical ability to download the update in the background and only notify the user once the new version is available?
    • (marshall_law) yes
  • Should we do anything special if the phone has been turned off for a few days and is then started?
    • (marshall_law) IIRC, the existing Gecko update internals already have the logic for knowing how long it's been since the last update check. They should be able to detect this and do an update check as soon as the phone is on. We should confirm this, though.
  • Should we inform users about how big updates will be before downloading them? Do we have the ability to tell before doing the actual download?
    • (marshall_law) We do have the ability -- the update MAR format specifies update size. We don't currently have plans to inform about the size of an update, but feel free to chime in on any of the bugs listed on how that might work.
  • Do we require the user to plug in the phone if battery is below X %?
    • (sicking) See answer for Gonk
  • What prompts do we present to user?
    • (marshall_law) see above
  • Do we have a rollback strategy for failed installs? Previous April discussion w/ cjones indicated no...
    • (marshall_law) we do in Phase 2, which I don't think we plan to have for v1. Will need to check w/ cjones to verify though
  • What is time to install? Several minutes?
    • (marshall_law) It all depends on the size of the update, the speed of the internal disk, the device hardware.. :) We might want to profile this in a few configurations..
  • Can we avoid user friction by downloading silently, and only during periods of user inactivity? Would not want to slow down web browsing while update processed in background, for example.
    • (marshall_law) yes, see the bugs about prompting
  • How many device reboots are required in the process?
    • (marshall_law) for Gecko, there shouldn't be any. only a process restart is required. the only time a restart might happen is if /system is somehow left in read-write after the updater is finished, as a fail safe.
  • How large are these updates?
    • (marshall_law) they are binary diff'd, so potentially not "huge", but again this all depends on how big the update is. definitely smaller than if we were downloading fresh binaries.
  • If the user powers down the device while an update is silently downloading in background, can we resume download later on?
    • (sicking) See answer for Gonk
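
The download, stage, restart and apply sequence that marshall_law describes above can be modelled as a small state machine. This is only a sketch: the paths in the comments come from that answer, but the enum, the function name and the single remount flag are assumptions made for illustration, not the updater's real interface.

```python
from enum import Enum, auto


class Stage(Enum):
    DOWNLOADED = auto()  # update file sits in /data/local/updates
    STAGED = auto()      # update staged in /system/b2g/updated
    RESTARTING = auto()  # b2g process shut down cleanly; the system restarts it
    APPLIED = auto()     # staged files copied into /system/b2g
    REBOOT = auto()      # fail safe: /system could not be remounted read-only


def next_stage(stage: Stage, remount_read_only_ok: bool = True) -> Stage:
    """Advance one step; a failed read-only remount forces a device restart."""
    # The updater remounts /system read-write both for staging and for applying.
    touches_system = stage in (Stage.DOWNLOADED, Stage.RESTARTING)
    if touches_system and not remount_read_only_ok:
        return Stage.REBOOT
    order = [Stage.DOWNLOADED, Stage.STAGED, Stage.RESTARTING, Stage.APPLIED]
    if stage not in order or stage is Stage.APPLIED:
        return stage  # terminal states stay put
    return order[order.index(stage) + 1]
```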


Gaia Updates

Introduction

  • Gaia updates are related to anything that may modify the user interface and experience of the OS. The update interval will also be every 18 weeks to align with Gecko updates. (to be clear, Gaia apps will be updated as part of Gecko update, correct? [see below] - ladamski)
  • Updates that happen to the Core Apps (Dialer, SMS, Camera, etc.) will happen silently and users will not be charged any carrier network fees for Gaia System and Core App updates (similar to Gecko updates via a private APN). All core apps should be updated simultaneously so that a single B2G version represents the full stack.

Open questions

  • How much are we testing Core app updates before delivering these updates to the APN?
    • This will be tested at the same level we are testing standard 3G connectivity to the carrier network? From CLee's etherpad outline...
  • Same questions from Gecko apply here:
  • What is the sequence of events (eg: prompt user to restart device, whereupon install process runs?)
    • How different is process from Gecko updates?
  • How often should we check for updates?
  • Will there be a back-up instance of Gaia in event of failed updates? If so, what will the sequence of its application be?
  • Does being on 3G/Edge affect when we check for Gecko updates?
    • Should not, since updates will be free OTA via Carrier's private APN.
  • What, if anything, should we tell the user when a Gecko update is detected? Should we behave differently if the user is on 3G/Edge connection when we detect that an update is available?
    • Download update and apply silently in background?
  • Do we have the technical ability to download the update in the background and only notify the user once the new version is available?
  • Should we do anything special if the phone has been turned off for a few days and is then started?
  • Do we need to have a mechanism for pushing extra-critical updates?
  • Should we inform users about how big updates will be before downloading them? Do we have the ability to tell before doing the actual download?
  • Do we require the user to plug in the phone if battery is below X %?
  • What prompts do we present to user?
  • Do we have a rollback strategy for failed installs? Previous April discussion w/ cjones indicated no...
  • What is time to install? Several minutes?
  • Can we avoid user friction by downloading silently, and only during periods of user inactivity? Would not want to slow down web browsing while update processed in background, for example.
  • How many device reboots are required in the process?
  • Do we provide link to changelog so user can review update details before installing?
  • How large are these updates?
  • Do we check for available device storage before downloading? If insufficient, how do we mitigate?
  • If the user powers down the device while an update is silently downloading in background, can we resume download later on?
  • How much user agency do we provide over installs? Can they defer? For how long? What affordances do we make for out of date Gaia + Core apps?
  • How can the user review the currently-installed version? From Settings?


Draft process for atomic Gecko+Gaia updates

Josh Carpenter, Aug 15

Overview

Design principles

  • Low-friction. Minimize user interruptions, connection speed impacts, etc.
  • Free. Avoid user charges.
  • Safe. Minimize chances and consequences of failed updates.
  • Patient. Support backwards compatibility for users who cannot update.
  • Friendly. Avoid presenting users with excess technical details.

Steps

Diagram: Gecko+Gaia Update Process v1 (PDF)

  1. Check for update
  2. Confirm available drive space
  3. Check connection
  4. Download
  5. Check Battery
  6. Install
  7. Follow-up

1. Check for update

Automatic (push)

  • Update server pushes silent "update available" notification to device.

Automatic (poll)

  • Device checks with server for available update at scheduled time/interval.

Manually

  • User initiates "Check for Updates" via UI input. Probably from Settings > Device > Device Information (although this has not been specced yet).

2. Confirm available drive space

  1. Check size of download.
  2. Check there is sufficient storage on device to download the update. Define minimum sufficient storage as some multiple of the update file size, in order to leave sufficient room for day to day device operation.
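
As a minimal sketch of the "some multiple of the update file size" rule: the function name and the default multiple of 2.0 are assumptions chosen for illustration, since the actual multiple has not been specified.

```python
def has_room_for_update(update_bytes: int, free_bytes: int,
                        safety_multiple: float = 2.0) -> bool:
    """True if free storage is at least `safety_multiple` times the update size."""
    if update_bytes < 0 or free_bytes < 0:
        raise ValueError("sizes must be non-negative")
    # Requiring a multiple of the download size leaves headroom for
    # day to day device operation while the update is stored.
    return free_bytes >= update_bytes * safety_multiple
```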

Sufficient

Proceed to next step.

Insufficient

Two possibilities: fail silently, or user prompt.

Fail Silently

Fail silently if update process was initiated Automatically (Push or Poll).

  • Update fails and goes into Wait state.
  • X minutes/hours/days (?) elapse -> Check sufficient storage again.

Prompt

Prompt user if update process was initiated Manually, and device is On and Unlocked. Contents as follows:

  • Image: Icon
  • Title: "Update available"
  • Body: "An FxOS update is available, but there is not enough space to download it. Free up space by deleting videos, files, etc."
  • Input: "OK"

User presses [OK]:

  • Update fails and goes into Wait state.
  • X minutes/hours/days (?) elapse -> Check sufficient storage again.

Because these prompts interrupt the user, we should consider a nuanced approach. For example, upon second prompt give user option to not be reminded again (although that creates fragmentation problems), or to choose between two varying reminder intervals (eg: 1 Day and 1 Week).
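
The rule for choosing between a silent failure and a prompt might be expressed as below. The names are hypothetical, and one case is an assumption: the draft does not say what happens when a manually initiated update fails while the device is locked or off, so this sketch treats it as a silent failure.

```python
from enum import Enum


class Initiator(Enum):
    PUSH = "push"      # automatic: server pushed an "update available" notice
    POLL = "poll"      # automatic: device checked on its schedule
    MANUAL = "manual"  # user chose "Check for Updates"


def should_prompt(initiator: Initiator, device_on: bool, unlocked: bool) -> bool:
    """Prompt only for manual updates on a device that is On and Unlocked."""
    # Assumption: any other combination fails silently into the Wait state.
    return initiator is Initiator.MANUAL and device_on and unlocked
```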

3. Check connection

Ensure updates are free for user by taking into account connection type before downloading.

No connection

The Network Status is one of following:

  • Airplane Mode
  • Searching
  • No Network

Update process can encounter this when:

  • Process is Manually initiated.
  • Process was Automatically initiated, but delayed due to insufficient drive space, and connection type has changed in meantime.

The Update process fails. Two possibilities (same as the two possibilities under "Paid", below).

WiFi

Proceed to next step.

Free (APN)

Proceed to next step.

Paid

User is on a paid data connection, probably Roaming. To avoid incurring charges, the update fails. Two possibilities:

Silent

Fail silently if update process was initiated Automatically (Push or Poll).

  • Update fails and goes into Wait state.
  • Exit Wait state when one of following occurs:
    • Connection type changes (eg: upon connection type change push the new type to the Updater)
    • Time interval passes (eg: check connection type every X hours/days)
    • New update push notification is received. Restart update process.

Prompt

Prompt user if update process was initiated Manually, and device is On and Unlocked. Contents as follows:

  • Image: Icon
  • Title: "Cannot download update" (verbiage TBD)
  • Body: "There is no data connection. FxOS cannot be updated. Connect to a WiFi or Data connection." (verbiage TBD)
  • Input: "OK"

User presses [OK]:

  • Update fails and goes into Wait state.
  • Exit Wait state when one of following occurs:
    • Connection type changes (eg: upon connection type change push the new type to the Updater)
    • Time interval passes (eg: check connection type every X hours/days)
    • New update push notification is received. Restart update process.

There's a lot of room to improve the flow of this scenario. eg:

  • Detect reason for failure (eg: Airplane mode vs Roaming) and tailor prompt accordingly (eg: present option to turn Airplane Mode off).
  • Check available WiFi connections and offer user chance to connect if an Open and/or Remembered network is detected.
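
The connection check in this step can be sketched as a simple classification, followed by a test for leaving the Wait state. All names here are hypothetical; how the device would actually tell a free APN connection from a paid one is not specified in this draft and is taken as given.

```python
from enum import Enum


class Network(Enum):
    AIRPLANE_MODE = "airplane mode"
    SEARCHING = "searching"
    NO_NETWORK = "no network"
    WIFI = "wifi"
    FREE_APN = "free apn"
    PAID = "paid"


FREE = {Network.WIFI, Network.FREE_APN}
NO_CONNECTION = {Network.AIRPLANE_MODE, Network.SEARCHING, Network.NO_NETWORK}


def check_connection(network: Network) -> str:
    """Return "proceed", or the reason the update goes into the Wait state."""
    if network in FREE:
        return "proceed"
    if network in NO_CONNECTION:
        return "wait: no connection"
    return "wait: paid connection"


def exits_wait(connection_changed: bool, interval_elapsed: bool,
               new_push: bool) -> bool:
    """Any one of the three listed triggers ends the Wait state."""
    return connection_changed or interval_elapsed or new_push
```

Returning the reason alongside the decision would let a later prompt be tailored to the cause, as suggested in the improvement notes above.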

4. Download

Once initiated, downloads are as follows:

  • Silent & invisible (no visible UI such as progress bars)
  • Pause for user network activity
  • Auto-complete in event of interruption. eg:
    • Connection type changes to Roaming.
    • Connection type changes to Airplane Mode.
    • User turns off device.
    • Battery dies.

We should also create error handler for: Storage space dropped below minimum threshold during update download. eg: While update is downloading, user copies video to device storage via USB.

Once download is complete, proceed to next step.
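
To make the pause, resume and storage-error behaviour concrete, here is a small simulation. It is a sketch under stated assumptions: the download is modelled as a fixed number of chunks, and each "tick" supplies hypothetical readings (user network activity, connection availability, free storage) that a real updater would obtain from the system.

```python
from typing import Iterable, Tuple

# Each tick is (user_network_active, connection_ok, free_bytes): hypothetical inputs.
Tick = Tuple[bool, bool, int]


def simulate_download(total_chunks: int, min_free_bytes: int,
                      ticks: Iterable[Tick]) -> Tuple[str, int]:
    """Replay the ticks and return (final_state, chunks_downloaded)."""
    done = 0
    for user_active, connection_ok, free_bytes in ticks:
        if free_bytes < min_free_bytes:
            # Error handler case: storage dropped below the minimum threshold.
            return ("error: low storage", done)
        if user_active or not connection_ok:
            continue  # paused; chunks already downloaded are kept
        done += 1
        if done == total_chunks:
            return ("complete", done)
    return ("paused", done)
```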

5. Check Battery

In order to minimize risk of failed update, check battery level before starting installation. Defer or proceed depending on % remaining.

Given the following caveats:

  • System can detect current battery level, but not drain rate (varies with battery age)...
  • It will be difficult to estimate the amount of power required to complete an install, given variations in device specs, update sizes, etc...

...we should build healthy margins into any "minimum required battery" thresholds. eg: Minimum 30%.

  • If minimum threshold is met, proceed to next step.
  • If minimum threshold is not met, two possibilities:

Fail Silently

Put installation into Wait state until future condition is met

Need to flesh this out further.

Prompt user

Prompt user to plug in to power source in order to proceed. Offer option to cancel, or initiate automatically as soon as power connection is detected.

Need to flesh this out further.
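
A minimal sketch of the battery gate follows. The function name is hypothetical, the 30% default is the example margin given above, and treating a connected power source as sufficient is taken from the "initiate automatically as soon as power connection is detected" option.

```python
def battery_decision(level_percent: float, plugged_in: bool,
                     minimum_percent: float = 30.0) -> str:
    """Return "install" or "wait" from the charge level and power state."""
    if not 0 <= level_percent <= 100:
        raise ValueError("battery level must be between 0 and 100")
    # Only the current level is known, not the drain rate, so the
    # threshold is deliberately generous.
    if plugged_in or level_percent >= minimum_percent:
        return "install"
    return "wait"
```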

6. Install

Installation process begins. There are three possibilities, depending on the current state of the device:

Manual Install

  • The device is On and Unlocked.
  • Because the installations are disruptive (requires a B2G process restart), the user is first prompted to initiate or defer the installation.
  • User has option to install immediately, or defer install.

Need to flesh this out further.

Silent Install

Two possibilities:

Update when idle

  • The device is On, Locked, and the likelihood of use is low.
  • Can occur when the download has completed while the device is locked, or when the user has chosen to Defer from a Manual Install prompt.
  • One possibility is to delay install until the early morning hours, when the user is least likely to be using the device.
  • The Install process executes without turning on the screen.

Update at power-On

  • The device powers-On from Off state
  • Can occur when the user has chosen to Defer from a Manual Install prompt, and the device has subsequently been powered Off. When the device is turned On, the device does a Battery check, and then installs automatically & silently.

In both cases, the user is only made aware that the update has occurred once it is complete, and they Unlock the device.

Need to flesh this out further.
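
The three install possibilities can be summarised as a decision on device state. This is a sketch with hypothetical names; it assumes a downloaded update is already pending, and the final "wait" branch (device locked but likely to be used soon) is an assumption, since the draft does not cover that case.

```python
def install_mode(just_powered_on: bool, unlocked: bool, likely_in_use: bool) -> str:
    """Pick how a pending, already downloaded update should be installed."""
    if just_powered_on:
        return "silent: at power-on"   # battery check, then install silently
    if unlocked:
        return "manual: prompt user"   # user installs now or defers
    if not likely_in_use:
        return "silent: when idle"     # eg: early morning, screen stays off
    return "wait"                      # assumption: locked but probably in use
```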

7. Follow-up

After install, we can consider informing the user that the update has occurred, and offer details such as link to update details (eg: "What's new in FxOS").


App Updates

Introduction

Apps can be divided into three categories, each with their own update policies.

  • Core apps
  • Pre-installed 3rd party apps
  • User-installed apps

Core apps

About

  • Are all packaged apps
  • Are pre-installed with the OS
  • Are not user-removable
  • Will survive a factory reset
    • We ensure that Core apps are available after a factory reset by storing them in the System partition, in /system/b2g/webapps, instead of the usual /data/local/webapps.

Updates

  • Core apps updates are bundled with Gecko updates, and are therefore governed by Gecko update policies.


Pre-installed third party apps

About

  • These are 3rd party apps that come bundled with the phone.
  • Are governed by same rules as User-installed apps.

Updates

  • Are governed by same rules as User-installed apps.

Open questions

  • Can these be uninstalled by the user?
  • What happens with these on a factory reset? Are they removed if installed? Re-spawn if uninstalled?

User-installed apps

About

  • Can be packaged or non-packaged
    • Differences between packaged and non-packaged apps should generally not be surfaced to the user.
    • For both types, we ensure that they stay up-to-date by checking for new versions at regular intervals.
  • Updates are not free and will incur data charges when on the carrier’s network.

Non-packaged app updates

  • Non-packaged apps can specify the location of an offline-cache manifest to be loaded at install time. This offline-cache is subsequently updated.
  • Update availability is checked by polling the developer website.
  • We don't yet have the ability to tell a non-packaged app that an update is available.

Packaged app updates

  • Update availability is checked by polling the store to see whether a new package is available.
  • We also have the ability to tell the app that an update is available.

Deferred download

  • We have the ability to download and install app updates while the previous versions are running. The new version is made available on app restart.
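
Deferred download can be pictured as keeping the new version aside until the app restarts. The class below is a hypothetical model of that behaviour, not the actual app registry.

```python
from typing import Optional


class InstalledApp:
    """Hypothetical model: an update installs in the background, swaps on restart."""

    def __init__(self, version: str) -> None:
        self.running_version = version
        self.pending_version: Optional[str] = None

    def install_update(self, version: str) -> None:
        # The previous version keeps running while the new one is installed.
        self.pending_version = version

    def restart(self) -> str:
        # The new version becomes available on app restart.
        if self.pending_version is not None:
            self.running_version = self.pending_version
            self.pending_version = None
        return self.running_version
```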

Open questions

  • How often should we check for app update?
    • Once a day, only on WiFi?
  • Does the frequency of usage affect how often we check for app updates?
  • Does being on 3G/Edge affect when we check for app updates?
  • What should we tell the user when an app update is detected?
  • What should we tell the user when an app update is detected while the app is running, or should we rely on the app to do so? (note that while we can inform a running app about an update being available, we can't detect if the app is actually doing anything useful with that information)
  • Should we behave differently if the user is on 3G/Edge connection when we detect that an update is available?
  • Can the user inspect permissions enumerated in the app at the time of installation? Should we let the user know if an update expands the list of permissions?
  • Do we need to have a mechanism for pushing extra-critical updates?
  • Should we inform users about how big updates will be before downloading them?
    • For un-packaged apps we generally can't tell how big an update is going to be. We could implement mechanisms for doing estimates, but we don't have anything right now
    • For packaged apps we could implement such a mechanism, but it depends on the protocols we use (see below):
    • What protocol should we use for detecting that an app update is available and downloading the update? This is a question we need to hammer out with the store people and the AMO people who have a lot of experience with updates for addons. The last two solutions involve new server-side APIs to be defined, but could potentially be more efficient. Three possibilities which have been discussed:
      • Check if the HTTP ETag of the package has changed by sending a conditional HTTP request with an If-None-Match header. This is what the work-in-progress implementation in bug 772364 is doing.
      • Group all the applications by store, and send to each store the list to check with ones to update. This could also return hashes for the new packages which could be safely downloaded from mirrors.
      • If the user has authentication credentials with a store, use a store specific api to get a list of updated applications.
  • Should we enable batch download of updates?
  • Should we indicate download+install progress to user?
  • Should we surface "this app has been updated" information to user?
  • Do we create user configuration options? eg:
    • Download and install apps in background.
  • How do we ensure backwards compatibility for apps that cannot update? eg: User is on Edge connection and rarely accesses via WiFi. Would their apps stop working once they are out of date?
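
The first protocol option discussed above (a conditional request with If-None-Match) could look roughly like this. It is a sketch with hypothetical function names and is not the implementation in bug 772364; it only relies on standard HTTP semantics, where 304 Not Modified means the stored ETag still matches.

```python
from typing import Dict, Optional


def conditional_headers(stored_etag: Optional[str]) -> Dict[str, str]:
    """Headers for the update check; empty if no ETag has been stored yet."""
    return {"If-None-Match": stored_etag} if stored_etag else {}


def package_changed(status: int, response_etag: Optional[str],
                    stored_etag: Optional[str]) -> bool:
    """Interpret the server's reply to the conditional request."""
    if status == 304:  # Not Modified: the package is unchanged
        return False
    if status == 200:
        # Assumption: a 200 without an ETag is treated as a changed package.
        return response_etag is None or response_etag != stored_etag
    raise ValueError(f"unexpected HTTP status {status}")
```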


mozApps API Changes

To support the previously described behaviour, we need a couple of additions to the content facing mozApps API, on the Application object:

  • Add a |readonly boolean removable| property.
  • Add a |DOMEventListener onupdated| event listener to be notified when an application has been updated. This lets a dashboard update any displayed item that could have changed (icon, application name, etc.).

Open questions:

  • Do we also need an event signaling that an update is available?
  • Do we also need an event signaling that an update has been downloaded?