Buildbot/Talos
Talos
A Python performance testing framework that is usable on Windows, Mac and Linux.
So, why Talos? Talos is the bronze automaton of Greek myth. Talos protected the island of Crete, throwing giant boulders at unwary seamen. He's also purported to have heated himself glowing hot and then embraced his enemies. Basically, he was awesome.
The Code
Talos lives in hg: http://hg.mozilla.org/build/talos
The pageloader bundle is in hg: hg.mozilla.org/build/pageloader
The Test Machines
All Talos tests are run on a pool of Mac minis (2.26 GHz Intel Core 2 Duo, 2 GB 1067 MHz DDR3).
The machines are imaged to comply with the Test Reference Platforms.
Talos Tests
tp5
Tests the time it takes Firefox to load the tp5 web page test set. The page set was culled from the Alexa top 500 on April 8th, 2011 and consists of 100 pages.
The set of pages is loaded sequentially 10 times.
Unfortunately, we do not distribute a copy of the set of test web pages.
Private Bytes
A memory metric tracked during tp5 test runs. This metric is sampled every 20 seconds.
For Windows, see the description from Microsoft TechNet.
RSS (Resident Set Size)
A memory metric tracked during tp5 test runs. This metric is sampled every 20 seconds. This metric is collected on Linux/Mac only.
Xres (X Resource Monitoring)
A memory metric tracked during tp5 test runs. This metric is sampled every 20 seconds. This metric is collected on Linux only.
Working Set (tp5_memset)
A memory metric tracked during tp5 test runs. This metric is sampled every 20 seconds. This metric is collected on Windows only. See the description from Microsoft TechNet.
Modified Page List Bytes
A memory metric tracked during tp5 test runs. This metric is sampled every 20 seconds. This metric is collected on Windows 7 only. See the description from Microsoft MSDN.
% CPU
CPU usage tracked during tp5 test runs. This metric is sampled every 20 seconds. This metric is collected on Windows only.
ts
Tests the startup time of Firefox by opening the browser 20 times.
The basic ts test uses a blank profile.
tdhtml
Tests which measure the time to cycle through a set of DHTML test pages.
The set of pages is loaded sequentially 5 times.
The list of pages is found here.
The test pages themselves are here.
tgfx
Measures the raw rendering speed of synthetic test pages, each designed to stress a particular aspect of gfx (e.g. transparency rendering or text rendering).
The set of pages is loaded sequentially 5 times.
The list of pages is found here.
The test pages themselves are here.
tsvg
An SVG-only number that measures SVG rendering performance.
The set of pages is loaded sequentially 5 times.
The list of pages is found here.
The test pages themselves are here.
twinopen (txul)
Tests the amount of time it takes to open a new window.
tsspider
Runs the SunSpider benchmark test.
The set of pages is loaded sequentially 5 times.
The list of pages is found here.
The test pages themselves are here.
JSS/Dromaeo Tests
The Dromaeo suite of tests for JavaScript performance testing. See the Dromaeo wiki for more information.
This suite is divided into several sub-suites.
Dromaeo CSS
The set of pages is loaded sequentially once.
The list of pages is found here.
The test pages themselves are here.
Dromaeo DOM
The set of pages is loaded sequentially once.
The list of pages is found here.
The test pages themselves are here.
Dromaeo JSLIB
The set of pages is loaded sequentially once.
The list of pages is found here.
The test pages themselves are here.
Dromaeo SunSpider
The set of pages is loaded sequentially once.
The list of pages is found here.
The test pages themselves are here.
Dromaeo Basics
The set of pages is loaded sequentially once.
The list of pages is found here.
The test pages themselves are here.
Dirty Tests
Uses a 'dirty' places.sqlite that more closely resembles that of an average user. The places.sqlite files used for testing are generated in three different sizes (small, medium, super-huge). These files are available by request to developers interested in testing with them.
Cold Tests
Measures cold startup times. These are achieved by purging caches before each browser opening of the standard ts test. This test is not available on Windows.
Paint Tests
Paint tests measure the time until the MozAfterPaint event is received, rather than the first time JavaScript can execute code after launching.
Currently we run Paint tests (in addition to the original tests) for:
- ts (ts_paint)
- txul/twinopen (tpaint)
Paint tests were added in bug 612190. In the future we might add a Paint-style test for tp (bug 661918), and we might disable the original ts/txul tests (bug 660124).
Chrome vs. NoChrome Tests
All tests run through the pageloader extension can be run with or without browser chrome. The tests load the same pages as described above in either case. The majority of tests are run with browser chrome enabled.
Running tests without the browser chrome makes it possible to further isolate performance regressions.
Addon Testing
Talos currently runs a subset of tests against addons. For these tests the descriptions are the same as above (ts, tp) but the given addon is installed before test execution.
We attempt to set the correct preferences for each individual addon so that it runs correctly (skipping first-run pages, being enabled) and generates useful numbers. If preferences are missing, please file a bug in Testing/Talos.
All addon performance results are displayed on the AddonTester waterfall (including the full log and any errors generated). The comparative results are found on the Slow Performing Addons main page.
How are the numbers calculated?
To ensure that the base profile is correctly installed for every test, the browser is opened and closed once before test execution. This first cold open is excluded from the test result calculation.
For "cold" tests the caches are cleared after this initial open/close, so the browser is configured and ready but returned to a "cold" state.
All tests are run with newly installed profiles; profiles are not shared across test runs.
Pageload style tests (tp5, tdhtml, tgfx, etc)
The overall test number is determined by first calculating the median page load time for each page in the set (excluding the max page load per individual page). The max median from that set is then excluded and the average is taken; that becomes the number reported to the tinderbox waterfall.
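The pageload calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from the Talos source: the function names (page_median, pageload_score) and the sample page names are hypothetical.

```python
from statistics import mean, median


def page_median(load_times):
    """Median load time for one page, ignoring that page's slowest load."""
    if len(load_times) < 2:
        raise ValueError("need at least two loads per page")
    trimmed = sorted(load_times)[:-1]  # drop the max load for this page
    return median(trimmed)


def pageload_score(times_by_page):
    """Average of the per-page medians after dropping the largest median."""
    medians = [page_median(times) for times in times_by_page.values()]
    if len(medians) < 2:
        raise ValueError("need at least two pages")
    medians.remove(max(medians))  # exclude the max median from the set
    return mean(medians)


# Hypothetical load times in milliseconds, four loads per page.
sample = {
    "a.html": [100, 110, 120, 500],
    "b.html": [200, 210, 220, 900],
    "c.html": [50, 60, 70, 80],
}
print(pageload_score(sample))  # per-page medians are 110, 210, 60 -> 85
```

Dropping the slowest load per page and then the slowest page keeps a single outlier from dominating the reported number.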
Ts style tests (ts, twinopen, ts_cold, etc)
The overall test number is calculated by excluding the max opening time and taking an average of the remaining numbers.
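As a rough sketch of the Ts style calculation, assuming the initial profile-installing open has already been left out of the input list. The function name startup_score and the sample timings are hypothetical, not taken from the Talos source.

```python
from statistics import mean


def startup_score(open_times):
    """Average opening time after excluding the single slowest opening."""
    if len(open_times) < 2:
        raise ValueError("need at least two openings")
    trimmed = sorted(open_times)[:-1]  # drop the max opening time
    return mean(trimmed)


# Hypothetical startup times in milliseconds.
print(startup_score([500, 300, 310, 290]))  # 500 is dropped -> 300
```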
Where are the numbers stored?
The results of every talos test are reported to the Perfomatic graph server.
Background Information
Naming convention
't' is prepended to the names to represent 'test'. Thus, ts = 'test startup', tp = 'test pageload', tdhtml = 'test dhtml'.
History of tp Tests
tp
The original tp test created by Mozilla to test browser page load time. Cycled through 40 pages. The pages were copied from the live web during November, 2000. Pages were cycled by loading them within the main browser window from a script that lived in content.
tp2/tp_js
The same tp test but loading the individual pages into a frame instead of the main browser window. Still used the old 40 page, year 2000 web page test set.
tp3
An update to both the page set and the method by which pages are cycled. The page set is now 393 pages from December, 2006. The pageloader is re-built as an extension that is pre-loaded into the browser chrome/components directories.
tp4
Updated web page test set to 100 pages from February 2009.
tp5
Updated web page test set to 100 pages from April 8th, 2011. Effort was made for the pages to no longer be splash screens/login pages/home pages but to be pages that better reflect the actual content of the site in question.
Let's Run The Tests!
Using the try server
If you have access to generate try builds, you can also have performance tests run against your custom build by following the guidelines here. The performance results will be generated on the same machines that generate the Talos results for all check-ins on all branches.
Running locally
Talos, the pageloader, and a distributable web page test set for tp have been packaged together into Standalone Talos. By following the given directions you will be able to run all the Talos tests on your local machine. The results you collect will not be directly comparable to production Talos runs on Mozilla machines, but by testing the browser with and without the changes you are interested in, you should be able to get an initial check on any performance regressions.
Bugs
Talos bugs, such as requests for new tests or repairs to the Talos code itself, are filed under Testing/Talos.
Graph server bugs are filed under Webtools/Graph server.
Talos machine maintenance bugs, such as problems with the hardware that Talos runs on or requests to run extra Talos tests against a given build, are filed under mozilla.org/Release Engineering.