TestEngineering/Performance/Sheriffing/Alerts
Revision as of 09:18, 16 January 2018
Perfherder alerts
General triage process
<how triaging is similar among all alerts>
Types of alerts
Talos
- short description: <provide it> link
- frequency: daily, a little more than a dozen alerts
- coalesced by SETA?: yes (often requires backfilling)
- available on platforms:
  - Windows: 7 32bit, 10 64bit (OPT, PGO builds)
  - Linux: 64bit (OPT, PGO builds)
  - OS X: 10.10 (OPT builds only)
- triaging specifics:
build_metrics
- short description: Monitor build times on multiple platforms, the size of the installers and other compiler-specific insights.
- frequency: every 1-2 days, around 5 alerts
- contact: :froydnj, :ted.mielczarek, :gps
- coalesced by SETA?: no (shouldn't require backfilling)
- available on platforms:
  - Windows: 32/64bit (OPT, No-OPT, Mingw builds)
  - Linux: 32/64bit (OPT, No-OPT builds)
  - OS X: 10.10 (cross, no-cross builds)
  - Android: 4.0, 4.2, 5.0
- triaging specifics:
  - often easy to investigate
  - most alerts aren't noisy
  - when investigating, look for build config changes <ask :gps to provide more data>
  - build times often spike for a short time, then drop back to previous levels thanks to the caching mechanisms in place; we mark these alerts as invalid
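The short-lived build time spikes described above can be told apart from real regressions by checking whether the values settle back to the earlier baseline. The following is only a minimal sketch of that check, not part of Perfherder: the function name `is_transient_spike`, the `window` size and the 5% `tolerance` are all hypothetical choices made for illustration, and the build times are made-up numbers.

```python
from statistics import median


def is_transient_spike(build_times, alert_index, window=5, tolerance=0.05):
    """Guess whether the jump at alert_index fades away on later pushes.

    build_times: build durations in seconds, ordered by push (hypothetical data).
    alert_index: position of the push that triggered the alert.
    window:      number of pushes compared on each side (assumed value).
    tolerance:   allowed relative drift from the baseline (assumed value).
    """
    before = build_times[max(0, alert_index - window):alert_index]
    after = build_times[alert_index + 1:alert_index + 1 + window]
    if not before or not after:
        # Not enough data on one side; do not call it transient.
        return False
    baseline = median(before)
    settled = median(after)
    # Transient if the later pushes are back within tolerance of the baseline.
    return settled <= baseline * (1 + tolerance)


spike = [600, 605, 598, 602, 900, 910, 601, 599, 603, 600]
regression = [600, 605, 598, 602, 900, 910, 905, 899, 903, 900]
print(is_transient_spike(spike, 4))
print(is_transient_spike(regression, 4))
```

Using the median on both sides keeps one or two leftover slow builds from hiding the recovery. A result of `True` only suggests the alert is a candidate for being marked invalid; the graph should still be checked by hand.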
Autophone
- short description: <provide it>
- frequency: every week or so, around 4 alerts
- contact: <mention Bob Clary>
- coalesced by SETA?: no
- available on platforms:
  - Android: 4.2, 4.4, 6.0, 7.1
- triaging specifics:
  - when investigating, look for Android-related changes
  - many of these tests are pretty noisy; often, the alerts turn out to be invalid (one reason is that devices overheat, which affects the tests)
  - consider setting needinfo? for Bob Clary to check the status of suspect phone devices
  - tricky to investigate; use Phonedash, a more precise investigation tool
  - retriggers are almost always needed, but results only show up after a day or so
AWSY
- short description: <provide it> link (https://wiki.mozilla.org/AWSY/Tests)
- frequency: every 2-3 days, around half a dozen alerts
- contact: :erahm
- coalesced by SETA?: yes
- available on platforms:
  - Windows: 32/64bit (OPT, PGO builds)
  - Linux: 64bit (OPT builds)
  - OS X: 10.10 (OPT builds)
  - Android: 4.2, 4.3 (OPT builds)
- triaging specifics:
  - retriggering/backfilling takes some time (>1h per test), so don't overuse it when collecting missing graph data
platform_microbench
- short description: <provide it>
- frequency: daily, around 1-2 dozen alerts
- contact:
- coalesced by SETA?: yes
- available on platforms:
  - Linux: 32bit (OPT builds), 64bit (OPT, PGO, ASAN builds)
  - Windows: 7 32bit, 10 64bit (OPT builds)
  - OS X: 10.10 (OPT builds)
- triaging specifics:
  - these alerts happen very often; unless triaged, they quickly pile up
  - very noisy; many of the alerts turn out to be invalid
  - cheap to retrigger, as each test takes <20min to finish; still, one should not abuse this