TestEngineering/Performance/Raptor: Difference between revisions

no edit summary
(Add high-value test information.)
No edit summary
Line 1: Line 1:
[[Image:Raptor.png|frameless|right]]
[[Image:Raptor.png|frameless|right]]


Raptor is a performance-testing framework for running browser pageload and browser benchmark tests. The core of Raptor was designed as a browser extension, therefore Raptor is cross-browser compatible and is currently running in production on Firefox Desktop, Firefox Android GeckoView, Fennec, Fenix, Reference Browser, Chromium, and Chrome.
Raptor is a performance-testing framework for running browser pageload and browser benchmark tests. Raptor is cross-browser compatible and is currently running in production on Firefox Desktop, Firefox Android GeckoView, Fenix, Reference Browser, Chromium, and Chrome.


* Contact: Dave Hunt [:davehunt]
* Contact: Dave Hunt [:davehunt]
Line 9: Line 9:
Raptor currently supports three test types: 1) page-load performance tests, 2) standard benchmark-performance tests, and 3) "scenario"-based tests, such as power, CPU, and memory-usage measurements on Android (and desktop?).
Raptor currently supports three test types: 1) page-load performance tests, 2) standard benchmark-performance tests, and 3) "scenario"-based tests, such as power, CPU, and memory-usage measurements on Android (and desktop?).


Locally, raptor can be invoked with either of the following commands - raptor-test may be deprecated in the future:
Locally, Raptor can be invoked with the following command:
 
     ./mach raptor
     ./mach raptor
    ./mach raptor-test


=== Page-Load Tests ===
=== Page-Load Tests ===
Line 40: Line 40:
* The Firefox android app is started up
* The Firefox android app is started up
* Post-startup browser settle pause of 30 seconds
* Post-startup browser settle pause of 30 seconds
* On Fennec only, a new browser tab is created (other Firefox apps use the single/existing tab)
* The test URL is loaded; measurements taken
* The test URL is loaded; measurements taken
* The tab is reloaded 14 more times; measurements taken each time
* The tab is reloaded 14 more times; measurements taken each time
Line 67: Line 66:
* The Firefox Android app is started up
* The Firefox Android app is started up
* Post-startup browser settle pause of 30 seconds
* Post-startup browser settle pause of 30 seconds
* On Fennec only, a new browser tab is created (other Firefox apps use the single/existing tab)
* The test URL is loaded; measurements taken
* The test URL is loaded; measurements taken
* The Android app is shut down
* The Android app is shut down
Line 75: Line 73:


==== Using Live Sites ====
==== Using Live Sites ====
It is possible to use live web pages for the page-load tests instead of using the Mitmproxy recordings. This option is available only when running on Try, as we don't want to submit data from live pages to Perfherder (since live page content will always be changing).
It is possible to use live web pages for the page-load tests instead of using the Mitmproxy recordings.


To run a particular Raptor tp6 page-load test with live sites, open the raptor-tp6*.ini file ([https://searchfox.org/mozilla-central/source/testing/raptor/raptor/tests Raptor tests folder]), and for the test default (or under a single page/subtest) just add this attribute:
To run a particular Raptor tp6 page-load test with live sites, open the raptor-tp6*.ini file ([https://searchfox.org/mozilla-central/source/testing/raptor/raptor/tests Raptor tests folder]), and for the test default (or under a single page/subtest) just add this attribute:
Line 86: Line 84:
It is possible to disable alerting for all our performance tests. Open the target test manifest such as the raptor-tp6*.ini file ([https://searchfox.org/mozilla-central/source/testing/raptor/raptor/tests Raptor tests folder]), and make sure there are no <tt>alert_on</tt> specifications.
It is possible to disable alerting for all our performance tests. Open the target test manifest such as the raptor-tp6*.ini file ([https://searchfox.org/mozilla-central/source/testing/raptor/raptor/tests Raptor tests folder]), and make sure there are no <tt>alert_on</tt> specifications.


When it's removed there will no longer be a <tt>shouldAlert</tt> field in the output perferhder data (you can find the [https://searchfox.org/mozilla-central/source/testing/mozharness/external_tools/performance-artifact-schema.json#68,165 schema here]). As long as <tt>shouldAlert</tt> is not in the data, no alerts will be generated. If you need to also disable code sheriffing for the test, then you need to change the tier of the task to 3.
When it's removed there will no longer be a <tt>shouldAlert</tt> field in the output Perfherder data (you can find the [https://searchfox.org/mozilla-central/source/testing/mozharness/external_tools/performance-artifact-schema.json#68,165 schema here]). As long as <tt>shouldAlert</tt> is not in the data, no alerts will be generated. If you need to also disable code sheriffing for the test, then you need to change the tier of the task to 3.
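
As a rough illustration of the check described above, the following Python sketch scans a Perfherder data blob and reports which suites or subtests still carry a <tt>shouldAlert</tt> field. The helper name <tt>alerting_fields</tt> and the sample blob are hypothetical and exist only for this example; the <tt>suites</tt>/<tt>subtests</tt> layout is assumed from the schema linked above, so verify it against the schema before relying on it.

```python
import json


def alerting_fields(perfherder_data):
    """Return names of suites/subtests that still contain a shouldAlert field."""
    found = []
    for suite in perfherder_data.get("suites", []):
        if "shouldAlert" in suite:
            found.append(suite["name"])
        for subtest in suite.get("subtests", []):
            if "shouldAlert" in subtest:
                found.append(f'{suite["name"]}/{subtest["name"]}')
    return found


# Hypothetical sample data, made up for illustration only.
sample = json.loads("""
{
  "framework": {"name": "raptor"},
  "suites": [
    {
      "name": "example-suite-a",
      "subtests": [
        {"name": "fcp", "value": 100, "shouldAlert": true},
        {"name": "loadtime", "value": 200}
      ]
    },
    {
      "name": "example-suite-b",
      "subtests": [{"name": "fcp", "value": 120}]
    }
  ]
}
""")

# An empty list would mean no shouldAlert field is present anywhere.
print(alerting_fields(sample))
```

If the returned list is empty, the blob contains no <tt>shouldAlert</tt> field, which per the paragraph above means no alerts will be generated for it.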


==== High value tests ====
==== High value tests ====
Line 129: Line 127:
For a combined-measurement run with distinct Perfherder output for each measurement type, you can do:
For a combined-measurement run with distinct Perfherder output for each measurement type, you can do:


   ./mach raptor-test --test raptor-scn-power-idle-bg-fenix --app fenix --binary org.mozilla.fenix.performancetest --host 10.0.0.16 --power-test --memory-test --cpu-test
   ./mach raptor --test raptor-scn-power-idle-bg-fenix --app fenix --binary org.mozilla.fenix.performancetest --host 10.0.0.16 --power-test --memory-test --cpu-test


Each measurement subtype (power-, memory-, and cpu-usage) will have a corresponding PERFHERDER_DATA blob:
Each measurement subtype (power-, memory-, and cpu-usage) will have a corresponding PERFHERDER_DATA blob:
Line 152: Line 150:
* We set `scenario_time` to '''20 minutes''' (1200000 milliseconds), and `page_timeout` to '''22 minutes''' (1320000 milliseconds)
* We set `scenario_time` to '''20 minutes''' (1200000 milliseconds), and `page_timeout` to '''22 minutes''' (1320000 milliseconds)
** It's crucial that `page_timeout` exceed `scenario_time`; if not, measurement tests will fail/bail early
** It's crucial that `page_timeout` exceed `scenario_time`; if not, measurement tests will fail/bail early
* We launch the {Fenix, Fennec, GeckoView, Reference Browser} on-Android app
* We launch the {Fenix, GeckoView, Reference Browser} on-Android app
* Post-startup browser settle pause of 30 seconds
* Post-startup browser settle pause of 30 seconds
* On Fennec only, a new browser tab is created (other Firefox apps use the single/existing tab)
* On Fennec only, a new browser tab is created (other Firefox apps use the single/existing tab)
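
The requirement that `page_timeout` exceed `scenario_time` can be sanity-checked before a run. The sketch below is a minimal example, assuming the test settings can be read as plain INI text by Python's <tt>configparser</tt> (the real Raptor manifests may use syntax it does not handle). The function name <tt>timeout_problems</tt>, the sample INI text, and the <tt>example-misconfigured-test</tt> section are hypothetical.

```python
import configparser


def timeout_problems(ini_text):
    """Return the test sections whose page_timeout does not exceed scenario_time."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    problems = []
    for section in parser.sections():
        # Values set under [DEFAULT] are inherited by every section.
        scenario_time = parser.getint(section, "scenario_time", fallback=None)
        page_timeout = parser.getint(section, "page_timeout", fallback=None)
        if scenario_time is None or page_timeout is None:
            continue  # not a scenario test, nothing to compare
        if page_timeout <= scenario_time:
            problems.append(section)
    return problems


# Hypothetical manifest text using the values quoted in the list above.
GOOD = """
[DEFAULT]
scenario_time = 1200000
page_timeout = 1320000

[raptor-scn-power-idle-bg-fenix]
apps = fenix
"""

# Hypothetical misconfiguration: the page would time out before the scenario ends.
BAD = """
[example-misconfigured-test]
scenario_time = 1200000
page_timeout = 600000
"""

print(timeout_problems(GOOD))
print(timeout_problems(BAD))
```

A section is reported when its `page_timeout` is less than or equal to its `scenario_time`, which is the condition under which measurement tests would fail or bail early.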
Line 159: Line 157:
* Raw power-use measurement data is listed in the perfherder-data.json/raptor.json artifacts
* Raw power-use measurement data is listed in the perfherder-data.json/raptor.json artifacts


In the Perfherder (or Firefox Health) dashboards for these power usage tests, all data points have milli-Ampere-hour units, with a lower value being better.
In the Perfherder dashboards for these power usage tests, all data points have milli-Ampere-hour units, with a lower value being better.
Proportional power usage is the total power usage of hidden battery sippers that is proportionally "smeared"/distributed across all open applications.
Proportional power usage is the total power usage of hidden battery sippers that is proportionally "smeared"/distributed across all open applications.


Line 165: Line 163:


To run on a tethered phone via USB from a macOS host, on:
To run on a tethered phone via USB from a macOS host, on:
===== Fennec =====
  ./mach raptor --test raptor-scn-power-idle-fennec --app fennec --binary org.mozilla.firefox --power-test --host 10.252.27.96


===== Fenix =====
===== Fenix =====
Line 192: Line 186:
* pgo builds tend to take about 10 to 15 minutes longer to complete than their opt counterparts (based on limited empirical evidence)
* pgo builds tend to take about 10 to 15 minutes longer to complete than their opt counterparts (based on limited empirical evidence)


==== Perf Dashboards ====
==== Dashboards ====


* Perfherder example (GeckoView): https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=mozilla-central,2027286,1,10&series=mozilla-central,2027291,1,10&series=mozilla-central,2027296,1,10
See [https://wiki.mozilla.org/TestEngineering/Performance/Results performance results] for our various dashboards.
* [https://github.com/mozilla-frontend-infra/firefox-health-dashboard/issues/420 Coming soon] to https://health.graphics/android


=== Running Locally ===
=== Running Locally ===
Line 251: Line 244:


* Device is in 'superuser' mode
* Device is in 'superuser' mode
** [stephend] - I want to explain this a bit more, so leaving this comment as a reminder


* The geckoview example app is already installed on the device (from ./mach bootstrap, above). Download the geckoview_example.apk from the appropriate [https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&searchStr=android%2Cbuild android build on treeherder], then install it on your device, i.e.:
* The geckoview example app is already installed on the device (from ./mach bootstrap, above). Download the geckoview_example.apk from the appropriate [https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&searchStr=android%2Cbuild android build on treeherder], then install it on your device, i.e.:
Line 325: Line 317:


With that setting, Raptor will not start the playback tool (i.e. Mitmproxy) and will not turn on the corresponding browser proxy, therefore forcing the test page to load live.
With that setting, Raptor will not start the playback tool (i.e. Mitmproxy) and will not turn on the corresponding browser proxy, therefore forcing the test page to load live.
When `use_live_pages = true` and a page-load test is measuring hero element (set in the test INI 'measure' option) then the hero element measurement will automatically be dropped - because the hero elements only exist in our Mitmproxy recordings and not in live pages.
The word 'live' will be appended to the test name in the PERFHERDER_DATA so live sites can be specifically seen in perfherder for try runs.
'''Important:''' This is fine for running on try, but we don't want to enable live sites in the production repos - because we don't want live site data being ingested by perfherder and used for regression alerting etc. Therefore as a safety catch, if using live sites the test won't even run unless running locally or on try.


=== Running Raptor on Try ===
=== Running Raptor on Try ===
Line 352: Line 338:
==== Raptor Hardware in Production ====
==== Raptor Hardware in Production ====


The Raptor performance tests run on dedicated hardware (the same hardware that the Talos performance tests use). See the [[https://wiki.mozilla.org/Performance_sheriffing/Talos/Misc#Hardware_Profile_of_machines_used_in_automation|Talos hardware used in automation wiki page]] for more details.
The Raptor performance tests run on dedicated hardware (the same hardware that the Talos performance tests use). See the [[/TestEngineering/Performance/Platforms|performance platforms]] for more details.
 
==== Running Fennec ESR 68 tests ====
 
Fennec 68 tests are set up to run on the latest Fennec ESR 68 build.
 
To start a try run on Fennec ESR 68 run:
 
    $ ./mach try fuzzy -q="fennec68" --full


=== Profiling Raptor Jobs ===
=== Profiling Raptor Jobs ===
Line 745: Line 723:
! Project !! Repository !! Tests results !! Schedule
! Project !! Repository !! Tests results !! Schedule
|-
|-
| Fenix (aka Firefox Preview) || [https://github.com/mozilla-mobile/fenix/ Github] || [https://treeherder.mozilla.org/#/jobs?repo=fenix Treeherder view] || Every 24 hours [https://tools.taskcluster.net/hooks/project-releng/cron-task-mozilla-mobile-fenix%2Fraptor Taskcluster force hook]
| Fenix || [https://github.com/mozilla-mobile/fenix/ Github] || [https://treeherder.mozilla.org/#/jobs?repo=fenix Treeherder view] || Every 24 hours [https://tools.taskcluster.net/hooks/project-releng/cron-task-mozilla-mobile-fenix%2Fraptor Taskcluster force hook]
|-
|-
| Reference-Browser || [https://github.com/mozilla-mobile/reference-browser/ Github] || [https://treeherder.mozilla.org/#/jobs?repo=reference-browser Treeherder view] || On each push
| Reference-Browser || [https://github.com/mozilla-mobile/reference-browser/ Github] || [https://treeherder.mozilla.org/#/jobs?repo=reference-browser Treeherder view] || On each push
Line 764: Line 742:
* https://firefox-source-docs.mozilla.org/tools/lint/coding-style/coding_style_python.html
* https://firefox-source-docs.mozilla.org/tools/lint/coding-style/coding_style_python.html


[https://github.com/psf/black/ black]is the tool used to reformat the Python code.
[https://github.com/psf/black/ black] is the tool used to reformat the Python code.