B2G/QA/Automation/UI/Strategy/Integration vs End to end
Objective
Separate concerns for Integration vs. User Acceptance Testing (UAT) so that test processes and automation harnesses can optimize for their specific purposes and minimize risk and dependency.
Challenges Addressed
- Development and QA have different approaches to, and needs from, UI testing
- The team has been blocked too often by cross-team dependencies
The Problem
To date, UI automation has been treated as a single type of testing, with the differences expressed largely in terms of whether it ran on TBPL, whether it was written in JS or Python, who sheriffed it, and so forth. Much of the discussion, debate, and analysis paralysis has concerned whether the automation should be run, written, or treated in a particular way. It has been all too common to reach an impasse based on differing opinions.
But not all automation is the same just because it targets the UI, just as not all testing is the same just because it targets the system under test (SUT).
In particular, automation run under continuous integration is generally subject to restrictions meant to ensure that test results are returned quickly and unambiguously. Those restrictions reduce the automation's usefulness for acceptance testing, which must accept a higher level of fragility to be more comprehensive.
When viewed through the lens of the other purpose, each type of automation is inferior:
Continuous integration automation is incomplete and sloppy, trading coverage for expedience, and is generally written from a developer's perspective to verify what they actually wrote, which might not meet the requirements of what they should have written. As such, it frequently falls short of the needs of acceptance.
And acceptance automation is too slow, device-bound, and sometimes suffers spurious failures due to non-determinism inherent in the SUT. Since the user scenarios it tests often bridge multiple parts of the system, it is fragile and poorly isolated. Its scope also means it often can't be written adequately until entire subsystems are in place. So it's often wholly unsuitable for the incremental growth, quick land-or-not decisions, and debugging assistance that a good continuous integration suite provides.
However, these weaknesses are a matter of perspective. When used for its own specific purpose, each type of automation provides great value, and the compromises each makes are compensated for by the other.
The Solution
The solution is to separate the concerns. And since each type has a different primary stakeholder, ownership should be separated as well.
Two different suites of UI automation
- Gaia Integration
  - Must be quick-running with no non-deterministic factors.
  - Results must be absolutely unambiguous.
  - Coverage is restricted because of these rules.
  - Will be run prior to each commit and can prevent a bad code change from landing.
  - Scope, detail, and maintenance are owned by functional teams as part of their code.
  - Small, isolated tests can be checked in with code changes immediately.
  - Is sheriffed reliably and quickly through a long-standing, established process.
  - Currently runs on B2G Desktop; will soon run on device as well.
- Gaia UAT
  - Can include longer-running tests and tolerate a reasonable amount of fragility due to non-determinism.
  - Results can occasionally be ambiguous as a trade-off for higher coverage.
  - Has far fewer restrictions than CI, so long as time spent triaging failures isn't extreme.
  - Runs after each Tinderbox build and finds post-integration bugs quickly.
  - Scope, detail, and maintenance are owned by QA.
  - Tests can be created after an area is delivered and its UI is stable enough to make the cost acceptable.
  - Must be sheriffed by QA, via a combination of alerts and result reviews.
  - Primarily runs on-device to test the full stack (see the sketch after this list).
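To make the contrast concrete, here is a minimal sketch of what a Gaia UAT-style scenario test might look like, assuming the Marionette Python client (marionette_driver). The SMS flow and element IDs are hypothetical illustrations, not actual Gaia locators; the point is the shape of the test, which drives the device as a user would and verifies the visible result.

    # Minimal sketch of a Gaia UAT-style scenario test, assuming the
    # Marionette Python client (marionette_driver). Element IDs and the
    # SMS flow are hypothetical illustrations, not actual Gaia code.
    from marionette_driver import By, Wait, expected
    from marionette_driver.marionette import Marionette

    def test_sent_sms_appears_in_thread():
        # Connect to the Marionette server on a device or B2G Desktop build.
        client = Marionette(host='localhost', port=2828)
        client.start_session()
        try:
            # Drive the UI exactly as a user would: tap through the flow
            # rather than calling into app internals.
            client.find_element(By.ID, 'new-message').tap()
            client.find_element(By.ID, 'recipient').send_keys('555-0100')
            client.find_element(By.ID, 'message-body').send_keys('hello')
            client.find_element(By.ID, 'send-button').tap()
            # Long waits are acceptable here: UAT trades speed for
            # end-to-end coverage of the full stack.
            Wait(client, timeout=30).until(
                expected.element_displayed(By.CSS_SELECTOR, '.message.sent'))
        finally:
            client.delete_session()

An equivalent Gaia Integration test would instead exercise a small fragment of this flow, in-process and against mocks, so it can run in milliseconds and fail for only one reason.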
Different Contexts
The main differences are:
- Gaia Integration tests run faster and are generally more isolated.
- Gaia UATs have more coverage because fewer rules constrain them.
- Gaia Integration tests can never fail unless there's a defective code change. They must be unambiguous.
- Gaia UATs can still fail non-deterministically if they are reliable enough to be a net gain.
- Gaia Integration tests are maintained by development to give themselves confidence in code changes.
- Gaia UATs are maintained by QA to replace or extend manual regression testing.
- Gaia Integration tests frequently test fragments of UI behavior and may be based on mocks or other low-level objects (see the sketch after this list).
- Gaia UATs are always written as complete user-like scenarios, and generally operate and verify as the user would.
- It is more important that Gaia Integration tests be solid than complete. Any incremental gain is valuable.
- It is more important that Gaia UATs be complete than solid. All acceptance criteria must be tested to accept a build.
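As a contrast to the scenario sketch above, here is what the mock-based fragment style might look like. This is an illustrative Python sketch (actual Gaia integration tests are written in JS); DialerKeypad and its collaborator are hypothetical stand-ins for a real UI fragment.

    # Hypothetical sketch of an integration-style "fragment" test: one small
    # piece of UI behavior, isolated from the rest of the system via a mock.
    import unittest
    from unittest import mock

    class DialerKeypad:
        """Hypothetical UI fragment: accumulates digits and updates a display."""
        def __init__(self, display):
            self.display = display
            self.digits = ''

        def press(self, digit):
            self.digits += digit
            self.display.show(self.digits)

    class DialerKeypadTest(unittest.TestCase):
        def test_pressed_digits_reach_display(self):
            # The display is a mock, so the test is deterministic, runs in
            # milliseconds, and fails only if this fragment's logic changes.
            display = mock.Mock()
            keypad = DialerKeypad(display)
            keypad.press('5')
            keypad.press('5')
            display.show.assert_called_with('55')

    if __name__ == '__main__':
        unittest.main()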
Ownership and Overlap
Ownership is separated because these differences can lead to divergent needs in terms of breadth and depth of verification.
Tests can and should overlap between Gaia Integration and Gaia UAT. It would be an unacceptable process complication for developers to have to consult QA's needs for every change, and vice versa. And if the suites were unified, it would be all too easy for one group to inadvertently make a change that damages the purpose of the other. For acceptance, in particular, this is too risky.
Avoiding this requires separate tests, even to the point where there might be tests whose code is entirely duplicated between the two suites.
While this seemingly violates "single point of truth," the different contexts in which the tests are specified, scoped, and maintained actually make these two different tests, much as two separate applications would never refer to each other's source unless it could be pushed into an independent library. Via skillful reuse of View and other generic code modules between suites, each test can be treated as an independent target without increasing maintenance unacceptably.
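For illustration, the kind of reuse meant here might look like the following: a generic View module shared as an independent library, with each suite keeping its own test on top of it. SmsView and its locators are hypothetical, and the Marionette Python client is assumed as above.

    # Hypothetical shared View module: it knows how to locate and drive one
    # piece of UI, but contains no assertions, so it belongs to neither
    # suite's tests and can be reused by both.
    from marionette_driver import By

    class SmsView:
        def __init__(self, client):
            self.client = client

        def open_new_message(self):
            self.client.find_element(By.ID, 'new-message').tap()

        def send(self, recipient, body):
            self.client.find_element(By.ID, 'recipient').send_keys(recipient)
            self.client.find_element(By.ID, 'message-body').send_keys(body)
            self.client.find_element(By.ID, 'send-button').tap()

Each suite then writes its own test against such a module, so even a fully duplicated scenario costs little to maintain: locators and interaction details live in one place, while the assertions, scope, and ownership of each test stay independent.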
Of course, each group should have an opinion on coverage for either suite, and can (and should) help expand both, but a single point of ownership allows decisions to be made quickly, as appropriate for each set of primary stakeholders.
Timeline
The time is now.
Unlike other aspects of our strategy, this is a perspective change and articulates a dual path towards the rest of our plans:
QA has defined and will own and expand UATs as an aid to its acceptance mandate, and functional teams will continue to own and expand CI as an aid to their own processes. QA will help whenever resources are available.
Risks
None.