B2G/QA/Automation/UI/Strategy/Integration vs End to end

Objective

Separate concerns for Integration vs. Acceptance tests so that test processes and automation harnesses can optimize for their specific purposes and minimize risk and dependency.

Challenges Addressed

  • Development and QA have different approaches and needs from UI testing
  • Team has been blocked too much by cross-team dependencies

The Problem

To date, UI automation has been treated as a single type of testing, with the differences discussed largely in terms of whether it ran on TBPL, whether it was written in JS or Python, who sheriffed it, and so forth. As a result, much discussion, debate, and analysis paralysis has centered on whether the automation should be run, written, or treated in a particular way, and it has been all too common to reach an impasse based on differing opinions.

But not all automation is the same just because it targets the UI, just as not all testing is the same just because it targets the SUT (the system under test).

In particular, integration automation run before landing a commit is generally subject to a series of restrictions meant to ensure that test results are returned quickly and unambiguously. Those restrictions reduce the automation's usefulness for acceptance testing, which must often accept greater fragility and longer runtimes in order to be more comprehensive.

When viewed through the lens of the other purpose, each type of automation is inferior:

Commit integration automation is incomplete and sloppy, trading coverage for expedience, and is generally written from a developer's perspective to verify the behavior of what they did write, which might not actually meet the requirements of what they should have written. As such, it frequently falls short of the needs of acceptance.

Conversely, build acceptance automation is too slow, device-bound, and prone to spurious failures due to non-determinism inherent in the SUT. Because the user scenarios it tests often bridge multiple parts of the system, it is fragile and poorly isolated. The need to rely on user-like behavior and to verify after every step slows it down. Its necessary scope means it often can't be written adequately until entire subsystems are in place. So it is often wholly unsuitable for the incremental growth, quick land-or-not decisions, and debugging assistance that a good continuous integration suite provides.

However, these are weaknesses only in perception. When used for its own specific purpose, each type of automation provides great value. Further, the compromises each makes are compensated for by the other.

The Solution

The solution is to separate the concerns. And since each type has a different primary stakeholder, ownership should be separated as well.

Two different suites of UI automation

  • Gaia Integration
    • Must be quick-running with no non-deterministic factors.
    • Results must be absolutely unambiguous.
    • Coverage is restricted because of these rules.
    • Will be run prior to each commit and can prevent a bad code change from landing.
    • Scope, detail, and maintenance are owned by functional teams as part of their code.
    • Small isolated tests can be checked in with code changes immediately.
    • Is sheriffed reliably and quickly through a long-standing, established process.
    • Currently runs on B2G Desktop; will soon run on devices as well.
  • Gaia Acceptance
    • Can include longer-running tests and tolerate a reasonable amount of fragility due to non-determinism.
    • Results can occasionally be ambiguous as a trade-off for higher coverage.
    • Has far fewer restrictions than Integration, so long as the time spent triaging failures isn't extreme.
    • Runs after each Tinderbox or candidate build to quickly find full-stack bugs.
    • Scope, detail, and maintenance are owned by QA.
    • Tests can be created after an area is delivered and its UI is stable enough to make the cost acceptable.
    • Must be sheriffed by QA, via a combination of alerts and result reviews.
    • Primarily runs on-device to test the full stack.

Different Contexts

The main differences are:

  • Gaia Integration tests run faster and are generally more isolated.
  • Gaia Acceptance tests have more coverage because they operate under fewer rules.
  • Gaia Integration tests can never fail unless there's a defective code change. They must be unambiguous.
  • Gaia Acceptance tests may still fail non-deterministically, provided they remain reliable enough to be a net gain.
  • Gaia Integration tests are maintained by development to give themselves confidence in code changes.
  • Gaia Acceptance tests are maintained by QA to replace or extend manual acceptance testing.
  • Gaia Integration tests frequently test fragments of UI behavior and may be based on mocks or other low-level objects.
  • Gaia Acceptance tests are written as complete user-like scenarios, and operate and verify as the user would (both styles are sketched after this list).
  • It is more important that Gaia Integration tests be solid than complete. Any incremental gain is valuable.
  • It is more important that Gaia Acceptance tests be complete than solid. All acceptance criteria must be tested to accept a build.
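
To make the contrast concrete, here is a minimal, runnable sketch of the two shapes. Every name in it (BrightnessServiceMock, FakeDevice, the test methods) is hypothetical and invented for illustration rather than taken from the actual Gaia suites: the integration-style test checks one fragment of behavior against a mock and can only fail on a defective change, while the acceptance-style test walks through a user scenario and verifies after each step.

  import unittest


  class BrightnessServiceMock:
      """Hypothetical stand-in for a low-level platform service."""

      def __init__(self):
          self.level = 0.5

      def set_level(self, value):
          # Clamp to the [0.0, 1.0] range, as the real service is meant to do.
          self.level = max(0.0, min(1.0, value))


  class FakeDevice:
      """Hypothetical device driver so the acceptance sketch runs anywhere."""

      def __init__(self):
          self.current_app = None
          self.brightness = 0.5

      def launch_app(self, name):
          self.current_app = name

      def tap(self, element_id):
          pass  # a real driver would locate and tap the on-screen element

      def drag_slider(self, name, to):
          if name == "brightness":
              self.brightness = to


  class IntegrationStyleTest(unittest.TestCase):
      """Integration style: one isolated fragment, mock-based, deterministic."""

      def test_slider_clamps_out_of_range_values(self):
          service = BrightnessServiceMock()
          service.set_level(1.7)
          self.assertEqual(service.level, 1.0)


  class AcceptanceStyleTest(unittest.TestCase):
      """Acceptance style: a complete user-like flow, verified step by step."""

      def test_user_dims_screen_from_settings(self):
          device = FakeDevice()
          device.launch_app("Settings")
          self.assertEqual(device.current_app, "Settings")
          device.tap("Display")
          device.drag_slider("brightness", to=0.2)
          self.assertAlmostEqual(device.brightness, 0.2)


  if __name__ == "__main__":
      unittest.main()

In the real suites, the acceptance scenario would drive the actual UI on a device rather than a stub, which is exactly where the longer runtimes and non-determinism come from.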

Ownership and Overlap

Ownership is separated because these differences lead to different needs for the breadth and depth of testing.

Tests can and should overlap between Gaia Integration and Gaia Acceptance. It would be an unacceptable process complication for QA to expect developers to consult its needs on every change, and vice versa. If test flows were shared, it would be all too easy for one group to inadvertently make a change that damages the other's purpose. For acceptance in particular, this is too risky, as proper acceptance depends heavily on maintaining a particular scope and depth.

Avoiding these communication issues and this risk requires separate tests, even to the point where there might be tests whose flow is entirely duplicated between the two suites.

While this seemingly violates "single point of truth," the different contexts in which the tests are specified, scoped, and maintained actually make them two different tests, much as two separate applications would never refer to each other's source unless it can be factored into an independent library.

Via skillful reuse of View and other code modules between suites, each test suite can be treated as an independent target without an unacceptable increase in maintenance. Ideally, only abstract fixture setup and flow are expressed in the test method, with all other maintainable aspects kept in shared modules. So long as both groups agree to maintain the interfaces and promised behavior of the shared module code, each can work freely regardless of how much they communicate.
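
As a rough illustration of that layering (all names below are hypothetical and invented for this sketch, not taken from the actual Gaia repositories), only the abstract flow lives in the test method, while locators and interaction details live in a view module that either suite can import:

  # In a real tree, DialerView would live in a shared module imported by both
  # suites; it is inlined here, with a stub driver, so the sketch is runnable.


  class DialerView:
      """Shared view: element locators and gestures live here, so tests in
      either suite depend only on this interface, never on raw element IDs."""

      def __init__(self, driver):
          self._driver = driver  # each suite supplies its own driver

      def launch(self):
          self._driver.launch_app("Dialer")

      def dial(self, number):
          for digit in number:
              self._driver.tap("keypad-" + digit)
          self._driver.tap("call-button")

      @property
      def in_call(self):
          return self._driver.is_displayed("call-screen")


  class FakeDriver:
      """Hypothetical driver stub standing in for a real automation client."""

      def __init__(self):
          self.taps = []

      def launch_app(self, name):
          self.launched = name

      def tap(self, element_id):
          self.taps.append(element_id)

      def is_displayed(self, element_id):
          # Pretend the call screen appears once the call button is tapped.
          return element_id == "call-screen" and "call-button" in self.taps


  def test_user_places_call():
      """Only abstract fixture setup and flow appear here; everything likely
      to need maintenance stays inside DialerView."""
      dialer = DialerView(FakeDriver())
      dialer.launch()
      dialer.dial("5551234")
      assert dialer.in_call


  if __name__ == "__main__":
      test_user_places_call()
      print("ok")

An integration test and an acceptance test could both call this same DialerView: when a locator or gesture changes, only the shared module needs updating, while each group keeps ownership of its own test methods, which is the arrangement described above.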

Of course, each group should have an opinion on coverage for either suite, and can (and should) help expand each suite, but a single point of ownership allows decisions to be made quickly, as appropriate for each set of primary stakeholders.

Timeline

The time is now.

Unlike other aspects of our strategy, this is a perspective shift and articulates a dual path towards the rest of our plans:

QA has defined and will own and expand Gaia Acceptance as an aid to its build acceptance mandate, and functional teams will continue to own and expand Gaia Integration as an aid to stabilization and their own processes. QA will help whenever resources are available.

Risks

None.