Buildbot/Talos

Talos

Talos is a python performance testing framework that is usable on Windows, Mac and Linux. Talos is our versatile performance testing framework we use at Mozilla. It was created to serve as a test runner for the existing performance tests that Mozilla was running back in 2007 as well as providing an extensible framework for new tests as they were created.

So, why Talos? Talos is the bronze automaton of Greek myth. Talos protected the island of Crete, throwing giant boulders at unwary seamen. He's also purported to have heated himself glowing hot and then embraced his enemies. Basically, he was awesome.

The Code

Talos lives in hg : http://hg.mozilla.org/build/talos

The pageloader extension lives in talos's repository: hg.mozilla.org/build/talos/talos/pageloader

The Test Machines

All talos test are run on a pool of 2.26 Ghz Intel Core 2 Duo, 2Gb 1067 MHz DDR3 mac minis.

The machines are imaged to comply with the Test Reference Platforms.

Order of Operations

For each test a new profile is installed in the browser (either an empty base profile or a profile with an existing places.sqlite in case of the dirty tests). Profiles are not shared across test runs. To initialize the profile an initial open/close is done to the browser. This initial open/close is not included in the test results and is only for configuration purposes.

Regressions

To determine whether a good point is "good" or "bad", we take 20-30 points of historical data, and 5 points of future data. We compare these using a t-test. See https://wiki.mozilla.org/images/c/c0/Larres-thesis.pdf#page=74 . Regressions are mailed to the dev-tree-management mailing list. Regressions are calculated by the analyze_talos.py script which uses a configuration file based on http://hg.mozilla.org/graphs/file/tip/server/analysis/analysis.cfg.template

Talos Tests

N = _nochrome; P = _paint	TBPL Abbreviation (TestName)
TestName	Trunk	Aurora	Beta	Release	ESR
[tp5]	tp (tp5n)	tp (tp5n)	tp (tp5n)	tp (tp5row)	tp (tp5)
[tsvg]	s (tsvgr)	s (tsvgr)	s (tsvgr)	s (tsvg)	s (tsvg)
[tsvg_opacity]	s (tsvgr_opacity)	s (tsvgr_opacity)	s (tsvgr_opacity)	s (tsvg_opacity)	s (tsvg_opacity)
[tdhtml]	deactivated	deactivated	deactivated	deactivated	deactivated
[tdhtml_nochrome]	deactivated	deactivated	deactivated	deactivated	deactivated
[a11y]	o (a11yr P)	o (allyr P)	o (allyr P)	c (ally.2 P)	c (ally P)
[ts_paint]	o (ts_paint)	o (ts_paint)	o (ts_paint)	c (ts_paint)	c (ts_paint)
[tpaint] (aka twinopen/txul)	o (tpaint)	o (tpaint)	o (tpaint)	c (tpaint)	c (tpaint)
[dromaeo_css]	d	d	d	dr	dr
[dromaeo_dom]	d	d	d	dr	dr
[tsspider]	deactivated	deactivated	deactivated	c (tsspider.2 P)	c (tsspider P)
[tsspider_nochrome]	deactivated	deactivated	deactivated	n (tsspider.2 NP)	n (tsspider NP)
[xperf]	x (windows only)
[ts_places_generated_med]	p (P)	p (P)	p (P)	di	di
[ts_places_generated_max]	p (P)	p (P)	p (P)	di	di
[tscroll]	o (tscrollr)	o (tscrollr)	o (tscrollr)	c (tscroll.2)	c (tscroll)
[tresize]	c (tresize)	c (tresize)	c (tresize)
[sunspider 0.9.1]	d (sunspider)	d (sunspider)	d (sunspider)
[kraken]	d (kraken)	d (kraken)	d (kraken)
[v8 (version 7)]	d (v8_7)	d (v8_7)	d (v8_7)

Where to get this information:

Talos tests are defined in http://hg.mozilla.org/build/talos/file/tip/talos/test.py
TBPL abbreviations are defined in http://hg.mozilla.org/users/mstange_themasta.com/tinderboxpushlog/file/tip/js/Config.js#l302
Perf-o-matic names are defined in http://hg.mozilla.org/graphs/file/tip/sql/data.sql
Talos suites are configured for production in http://hg.mozilla.org/build/buildbot-configs/file/tip/mozilla-tests/config.py; these names are mapped to TBPL via regexes: http://hg.mozilla.org/users/mstange_themasta.com/tinderboxpushlog/file/e2e344885c80/js/Data.js#l512

A table detailing information flow from buildbot to talos to TBPL and graphserver is available at http://k0s.org:8080/ (an outdated static copy lives at http://k0s.org/mozilla/talos/talosnames.html ) . This is generated with the talosnames script, as detailed in http://k0s.org/mozilla/blog/20120724135349 . See also bug 770460.

tp5

contact: :jhammel, :jmaher
source: not available
type: PageLoader

Talos test name	Graphserver name	Description
tp5r	Tp5r MozAfterPaint	tp5 with responsiveness
tp5row	Tp5 Row Major MozAfterPaint	tp5r running in Row Major with 25 cycles/page, ignoring the first 5
tp5n	Tp5 No Network Row Major MozAfterPaint	tp5row with a new tp5.zip that has no 404s and no external network access
tp5	Tp5 MozAfterPaint	Measures the time to load a webpage and receive [both a MozAfterPaint and OnLoad event].

Tests the time it takes Firefox to load the tp5 web page test set. The web set was culled from the Alexa top 500 April 8th, 2011 and consists of 100 pages.

Unfortunately, we do not distribute a copy of the set of test web pages as these would not constitute fair use. Here are the broad steps we use to create the test set:

Take the Alexa top 500 sites list
Remove all sites with questionable or explicit content
Remove duplicate site (for ex. many Google search front pages)
Manually select to keep interesting pages (such as pages in different locales)
Select a more representative page from any site presenting a simple search/login/etc. page
Deal with Windows 255 char limit for cached pages
Limit test set to top 100 pages

Note that the above steps did not eliminate all outside network access so we had to take further action to scrub all the pages so that there are 0 outside network accesses (this is done so that the tp test is as deterministic measurement of our rendering/layout/paint process as possible). If you are on the Mozilla intranet, you can obtain the current page set for local testing. DO NOT DISTRIBUTE IT.

Private Bytes

A memory metric tracked during tp4 test runs. This metric is sampled every 20 seconds.

For windows, a description from Microsoft TechNet.

RSS (Resident Set Size)

A memory metric tracked during tp5 test runs. This metric is sampled every 20 seconds. This metric is collected on linux/mac only.

Description from wikipedia.

Xres (X Resource Monitoring)

A memory metric tracked during tp5 test runs. This metric is sampled every 20 seconds. This metric is collected on linux only.

xres man page.

Working Set (tp5_memset)

A memory metric tracked during tp5 test runs. This metric is sampled every 20 seconds. This metric is collected on windows only. Description from Microsoft TechNet.

Modified Page List Bytes

A memory metric tracked during tp5 test runs. This metric is sampled every 20 seconds. This metric is collected on Windows7 only. Description from Microsoft MSDN.

% CPU

Cpu usage tracked during tp5 test runs. This metric is sampled every 20 seconds. This metric is collected on windows only.

Responsiveness

Reports the delay in milliseconds for the event loop to process a tracer event. For more details, see bug 631571.

ts_paint

contact: :mak, :jimm, :jhammel, :jmaher
source: tspaint_test.html
Perfomatic: "Ts, Paint"
type: Startup

Launches tspaint_test.html with the current timestamp in the url, waits for [MozAfterPaint and onLoad] to fire, then records the end time and calculates the time to startup.

The basic ts test uses a blank profile. Formerly known as ts before we looked for the MozAfterPaint event.

ts_places_generated_med

contact: :mak, :jhammel, :jmaher
source: tspaint_test.html
type: Startup
dirty: this is also referred to as the dirty test

Runs the same test as ts_paint, but uses a generated profile to simulate what an average user would have. This profile is very outdated and needs to be updated.

ts_places_generated_max

contact: :mak, :jhammel, :jmaher
source: tspaint_test.html
type: Startup
dirty: this is also referred to as the dirty test

Runs the same test as ts_paint, but uses a generated profile to simulate what a power user would have. This profile is very outdated and needs to be updated.

tdhtml

contact: :peterv, :jhammel, :jmaher
source: dhtml.manifest
type: PageLoader

Talos test name	Graphserver name	Description
tdhtmlr	DHTML Row Major	Row based and 25 cycles/page.
tdhtml.2	DHTML 2	Ignoring the first value instead of the highest (usually the highest is the first)

Tests which measure the time to cycle through a set of DHTML test pages. This test will be updated in the near future.

This test is also ran with the nochrome option.

tsvg

contact: :jwatt, :jhammel, :jmaher
source: svg.manifest
type: PageLoader

Talos test name	Graphserver name	Description
svgr	SVG Row Major	Row Major and 25 cycles/page.
svg	SVG	Column based and 5 cycles.

An svg-only number that measures SVG rendering performance.

tsvg-opacity

contact: :jwatt, :jhammel, :jmaher
source: svg.manifest
type: PageLoader

Talos test name	Graphserver name	Description
svgr_opacity	SVG, Opacity Row Major	Row Major and 25 cycles/page.
svg_opacity	SVG, Opacity	Column based and 5 cycles.

An svg-only number that measures SVG rendering performance.

tpaint

contact: :jimm, :jhammel, :jmaher
source: [tpaint-window.html]
type: Startup

Talos test name	Graphserver name	Description
tpaint	Paint	twinopen but measuring the time after we receive the [MozAfterPaint and OnLoad event].
twinopen		original test to measure the time to open window based on OnLoad event.
txul	Txul	another name for twinopen. Also we report txul in the regression emails.

Tests the amount of time it takes the open a new window. This test does not include startup time. Multiple test windows are opened in succession, results reported are the average amount of time required to create and display a window in the running instance of the browser. (Measures ctrl-n performance.)

sunspider

contact: :dmandelin, :jhammel, :jmaher
source: [sunspider.manifest]
type: PageLoader
measuring: ???
reporting: ???

Talos test name	Graphserver name	Description
sunspider	SunSpider 0.9.1	updated sunspider version 0.9.1 test suite.
tsspider.2	SunSpider 2	same as tsspider, but ignore the first value instead of the largest.
tsspider	SunSpider	original sunspider test (version unknown), run 5 cycles

This is the sunspider javascript benchmark taken verbatim and slightly modified to fit into our pageloader extension and talos harness.

The previous version of this test is tsspider which is deprecated now.

JSS/Domaeo Tests

Dromaeo suite of tests for JavaScript performance testing. See the Dromaeo wiki for more information.

This suite is divided into several sub-suites.

Dromaeo CSS

contact: :dmandelin, :jhammel, :jmaher source: [css.manifest] type: PageLoader

Each page in the manifest is part of the dromaemo css benchmark.

Dromaeo DOM

contact: :dmandelin, :jhammel, :jmaher source: [dom.manifest] type: PageLoader

Each page in the manifest is part of the dromaemo css benchmark.

Dromaeo JSLIB

The set of pages is loaded sequentially once.

The list of pages is found here.

The test pages themselves are here.

Deprecated as of June 30th, 2012

Dromaeo SunSpider

The set of pages is loaded sequentially once.

The list of pages is found here.

The test pages themselves are here.

Deprecated as of June 30th, 2012

Dromaeo Basics

The set of pages is loaded sequentially once.

The list of pages is found here.

The test pages themselves are here.

Deprecated as of June 30th, 2012

Dromaeo V8

The set of pages is loaded sequentially once.

The list of pages is found here.

The test pages themselves are here.

Deprecated as of June 30th, 2012

Trace Malloc

This test uses the trace-malloc tool from tools/trace-malloc to wrap calls to malloc and log information about every memory allocation.

a11y

contact: :davidb, :tbsaunde, :jhammel, :jmaher
source: [a11y.manifest]
type: PageLoader
measuring: ???
reporting: ???

Talos test name	Graphserver name	Description
a11yr	a11y Row Major MozAfterPaint	Row Major testing with 25 cycles per page
a11y.2	a11y 2 MozAfterPaint	same as a11y ignoring the first value collected instead of the largest
a11y	a11y MozAfterPaint	iterate through each page, 5 cycles through the list, ignore the highest value from each page

This test ensures basic a11y tables and permutations do not cause performance regressions.

tscroll

contact: :jrmuizel, :jhammel, :jmaher
source: [scroll.manifest]
type: PageLoader
measuring: ???
reporting: ???

Talos test name	Graphserver name	Description
tscrollr	tscroll Row Major	Row Major testing with 25 cycles
tscroll.2	tscroll 2	Ignore the first value for each page instead of the largest
tscroll	tscroll	run through each page in the manifest and cycle 5 times. For each page, ignore the largest value

This test does some scrolly thing

tresize

contact: :jimm, :jmaher
source: [tresize-test.html]
type: StartupTest
measuring: Time to do XUL resize
reporting: ???

Talos test name	Graphserver name	Description
tresize	tresize	TODO

This test does some resize thing thing

xperf

contact: :taras, :jhammel, :jmaher
source: [xperf instrumentation]
type: Pageloader (tp5n)
measuring: IO counters from windows
reporting: Summary of read/write counters for disk, network

Xperf runs tp5 while collecting xperf metrics for disk IO and network IO. The providers we listen for are:

['PROC_THREAD', 'LOADER', 'HARD_FAULTS', 'FILENAME', 'FILE_IO', 'FILE_IO_INIT']

The values we collect during stackwalk are:

['FileRead', 'FileWrite', 'FileFlush']

kraken

contact: :dmandelin, :jhammel, :jmaher
source: [kraken.manifest]
type: PageLoader
measuring: ???
reporting: ???
Perfomatic: Kraken Benchmark MozAfterPaint

this is the Kraken javascript benchmark taken verbatim and slightly modified to fit into our pageloader extension and talos harness.

V8, version 7

contact: :dmandelin, :jhammel, :jmaher
source: [v8.manifest]
type: PageLoader
measuring: ???
reporting: ???
Perfomatic name: V8 version 7 MozAfterPaint

this is the V8 (version 7) javascript benchmark taken verbatim and slightly modified to fit into our pageloader extension and talos harness.

The previous version of this test is V8 version 5 which was run on selective branches and operating systems.

Other data that is reported to graphserver and reported to dev.tree-management as talos failures

Linux

num_ctors: https://hg.mozilla.org/build/tools/file/348853aee492/buildfarm/utils/count_ctors.py

Linux, OSX 10.7

codesighs (build with --enable-codesighs)

Linux 64 debug, osx 10.7 debug, win debug

source is leaktest.py in the tree
trace_malloc_leaks
trace_malloc_maxheap
trace_malloc_alloc

Talos Test Types

There are two different species of Talos tests:

#Startup Tests : start up the browser and wait for either the load event or the paint event and exit, measuring the time
#Page Load Tests : load a manifest of pages

Startup Tests

Startup tests launch Firefox and measure the time to the onload or paint events. Firefox is invoked with a URL to:

http://hg.mozilla.org/build/talos/file/tip/talos/startup_test/startup_test.html for the onload event
http://hg.mozilla.org/build/talos/file/tip/talos/startup_test/tspaint_test.html for the paint event

Page Load Tests

Many of the talos tests use the page loader to load a manifest of pages. These are tests that load a specific page and measure the time it takes to load the page, scroll the page, draw the page etc. In order to run a page load test, you need a manifest of pages to run. The manifest is simply a list of URLs of pages to load, separated by carriage returns, e.g.:

http://www.mozilla.org
http://www.mozilla.com

Example: http://hg.mozilla.org/build/talos/file/tip/talos/page_load_test/svg/svg.manifest

Manifests may also specify that a test computes its own data by prepending a % in front of the line:

% http://www.mozilla.org
% http://www.mozilla.com

Example: http://hg.mozilla.org/build/talos/file/tip/talos/page_load_test/v8_7/v8.manifest

The file you created should be referenced in your config file, for example, open sample.config, and look for the line referring to the test you want to run:

- name: tp4
url: '-tp page_load_test/tp4.manifest -tpchrome -tpnoisy -tpformat tinderbox -tpcycles 10'

-tp controls the location of your manifest
-tpchrome tells Talos to run the browser with the normal browser UI active
-tpnoisy means "generate lots of output"
-tpformat controls the format of the results, they default to the results we send to displays like graphserver and tbpl.
-tpcycles controls the number of times we run the entire test.

Running Tp the Automation Way

In our automation, we run beneath many more restrictions than normal users. One of those restrictions is that our automation machines are walled off from the real-world networks. Because of this, and because we want to test pure page-loading and rendering time of Firefox, we serve the pages from localhost using Apache thus eliminating all network latency and uncertainty. You've probably noticed this if you looked at the talos/page_load_test/tp4.manifest.

To do this, we construct full downloads of sites in our manifest and they are placed on the automation slave at run time. Because we cannot at this time distribute our full page load test set, I'll walk through how these are set up and show you how to make your own. Note that our next version of the page load set will be distributable, so soon this won't be an issue.

In the meantime, here's the instructions:

Use this script or, you use the following wget command to fetch a page and everything it links to in order to have a complete page for offline use:

 $> wget -p -k -H -E -erobots=off --no-check-certificate -U "Mozilla/5.0 (firefox)" --restrict-file-names=windows --restrict-file-names=nocontrol $URL -o outputlog.txt

Once you have a cache of pages, install Apache:
```
 $> sudo apt-get install apache2
```
Copy your page into the proper location for Apache to serve it. Note that I like to create a page_load_test directory to separate talos from anything else on the webserver. So with Apache defaults, that's something like:
```
 $> mkdir /var/www/page_load_test; cp -R <dir> /var/www/page_load_test/.
```
Now, add the local URL into your manifest:
```
http://localhost/page_load_test/<dir>
```
Run the tp tests as above, pointing the config file at your manifest.

Paint Tests

Paint tests are measuring the time to receive both the MozAfterPaint and OnLoad event instead of just the OnLoad event.

Currently we run _paint tests for these tests:

ts_paint
tpaint
tp5n
sunspider
tdhtml
a11y
tscroll

NoChrome Tests

All tests run through the pageloader extension can be run with or without browser chrome. The tests load the same pages as described above in either case. The majority or tests are run with browser chrome enabled. On mobile (native android builds) we have to run everything as nochrome since we don't support additional xul windows.

The ability to run tests without the browser chrome opens up the ability to further isolate performance regressions.

Currently we run tdhtml as nochrome.

Adding a new test

Everybody wants moar tests, there is a lot that goes into adding a talos test:

file a bug to add appropriate rows to graphserver.
- file an additional IT bug to deploy the sql changes on the production and staging graph servers
- if this is adding new pages, ensure these sql changes include page definitions. NOTE: this one detail is usually forgotten since it is so rare
file a bug to add tests to talos.
create a talos.zip file and file a releng bug to upload it to the build network
create a patch for buildbot to add a definition of this new test and turn it on for the current branch
create a m-c (not inbound) patch to modify testing/talos/talos.json, get it reviewed and landed
create bugs for each time we uplift from mozilla-central->aurora->beta->release->esr to turn on your test
if this is an update to an existing test that could change the numbers, this needs to be treated as a new test and run side by side for a week to get a new baseline for the numbers.
file a bug to get tbpl updated with a new letter to track this test

While that is a laundry list of items to do, if you are developer of a component just talk to the a*team (jhammel or jmaher) and they will handle the majority of the steps above. When adding a new test, we really need to understand what we are doing. Here are some questions that you should know the answer to before adding a new test:

What does this test measure?
Does this test overlap with any existing test?
What is the unit of measurement that we are recording?
What would constitute a regression?
What is the expected range in the results over time?
Are there variables or conditions which would affect this test?
- browser configuration (prefs, environment variables)?
- OS, resources, time of day, etc... ?
Indepenedent of Observation? Will this test produce the same number regardless of what was run before it?
What considerations are there for how this test should be run and what tools are required?

Addon Testing

Talos currently runs a subset of tests against addons. For these tests the descriptions are the same as above (ts, tp) but the given addon is installed before test execution.

We attempt to have the correct preferences set for each individual addon to allow them to run correctly (skipping first-run pages, being enabled) and generate useful numbers. If preferences are missing please file a bug in [Testing/Talos].

All addon performance results are displayed on the [AddonTester waterfall] (including full log and any errors generated). The comparative results are found on the [Slow Performing Addons] mainpage.

Mobile testing

See the talos section of the main Android development page for details on this.

How are the numbers calculated?

To ensure that the base profile is correctly installed for every test the browser is opened/closed once before test execution. This first cold open is excluded for the test result calculation.

For "cold" tests the caches are cleared after this initial open/close - in this way the browser is configured and ready but returned to a "cold" state.

All tests are run with newly installed profiles - profiles are not shared across test runs.

Pageload style tests (tp5, tdhtml, etc)

The overall test number is determined by first calculating the median page load time for each page in the set (excluding the max page load per individual page). The max median from that set is then excluded and the average is taken; that becomes the number reported to the tinderbox waterfall.

Ts style tests (ts, twinopen, ts_cold, etc)

The overall test number is calculated by excluding the max opening time and taking an average of the remaining numbers.

Where are the numbers stored?

The results of every talos test are reported to the Perfomatic graph server. When running locally, you can specify output to a file using the --results_url argument to PerfConfigurator, e.g.

   PerfConfigurator --activeTests tsvg -e `which firefox` -o tsvg.yml --results_url file://${PWD}/tsvg.txt

Talos data formatting

Background Information

Naming convention

't' is pre-pended to the names to represent 'test'. Thus, ts = 'test startup', tp = 'test pageload', tdhtml = 'test dhtml'.

History of tp Tests

tp

The original tp test created by Mozilla to test browser page load time. Cycled through 40 pages. The pages were copied from the live web during November, 2000. Pages were cycled by loading them within the main browser window from a script that lived in content.

tp2/tp_js

The same tp test but loading the individual pages into a frame instead of the main browser window. Still used the old 40 page, year 2000 web page test set.

tp3

An update to both the page set and the method by which pages are cycled. The page set is now 393 pages from December, 2006. The pageloader is re-built as an extension that is pre-loaded into the browser chrome/components directories.

tp4

Updated web page test set to 100 pages from February 2009.

tp4m

This is a smaller pageset (21 pages) designed for mobile Firefox. This is a blend of regular and mobile friendly pages.

We landed on this on April 18th, 2011 in bug 648307. This runs for Android and Maemo mobile builds only.

tp5

Updated web page test set to 100 pages from April 8th, 2011. Effort was made for the pages to no longer be splash screens/login pages/home pages but to be pages that better reflect the actual content of the site in question.

Let's Run The Tests!

I have a patch for Talos, what tests to I run?

If you are making changes to talos obviously running the tests locally will be the first step. The next logical step is to run the tests on Try server (try: -b o -p all -u none -t all).

Testing Locally

Testing locally involves running some subset of the Talos tests on desktop and possibly mobile. Obviously not all permutations of tests and the ways of running them can be tried, so common sense should be used as to what is run. You may also want to run Talos' internal unittests: http://hg.mozilla.org/build/talos/file/tip/tests

You should tailor your choice of tests to pick those that cover what you've changed programmatically, but in general you should probably run at least one startup test and one pageloader test. A good baseline might be:

 # refer to running locally for more details
 talos -n -d -a ts:tsvg -e `which firefox` --develop --datazilla-url output.json --mozAfterPaint

Testing on Try Server

To speed up your development time and everybody else who uses Try server, there is no need to run all tests on all platforms unless this is a major change. Here are some guidelines to follow for testing patches to talos (or dependent modules):

If you patch touches one file and is very minor, get minimal testing on it (1 mobile test, 1 windows test, local testing): try: -b o -p win32,android -u none -t svgr,remote-tsvg
If you patch affects setup of the tests (config or profile) or launching of the process and is very minimal, get minimal testing on it (1 mobile test, 1 windows test, local testing): try: -b o -p win32,android -u none -t svgr,remote-tsvg
If your patch changes a test (add or edit), test that test locally (remember a new test will need graph server changes and buildbot changes)
If your patch changes mobile testing, test all mobile tests and 1 desktop test: (There is no good current way of doing this with try; I recommend two try runs:

try: -b o -p win32 -u none -t svgr

try: -b o -p android -u none -t all

)

If your patch changes results or output processing, run ts, tp5 on mobile, windows and locally: try: -b o -p win32,android -u none -t tpn,chromez,remote-ts,remote-tp4m_nochrome
If your patch changes any import statement (or code referenced in the import), please test on windows, linux, and mobile as these all run different versions of python: try: -b o -p linux,win32,android -u none -t all
If your patch changes a lot of stuff around, if you are not sure, or if you are going to deploy a new talos.zip, it is strongly recommended to run all talos tests: try: -b o -p all -u none -t all

If you are reviewing a talos patch, it is your responsibility to recommend the proper testing approach. If you think it needs more than these guidelines, call it out. If it needs less, call it out also.

Are my numbers ok

The best way to answer this question is to push to try server and compare the reported numbers from the logs (use tbpl as a log parser) and compare that with the [graph server]. I recommend using tbpl to open the link to the graphs.

If you are planning on landing on mozilla-central, look at tests from mozilla-central. Be aware of PGO vs Non PGO and Chrome vs Non Chrome. TBPL makes this a moderately pain free process (i.e. about 30 minutes). This is one of the big problems we are solving with datazilla.

Using try server

If you have access to generate try builds you can also have performance tests run against a custom version of talos. The performance results will be generated on the same machines that generate the talos results for all check-ins on all branches. This involves a few steps:

Run create_talos_zip.py from the root of your talos directory and upload the file somewhere that the build system can find it (e.g. http://people.mozilla.org/~wlachance/talos.zip)
Check out a copy of Mozilla central
Modify the file "testing/talos/talos.json" to point to the copy of talos you uploaded earlier
Push this change to try server using the right syntax (you can use TryChooser to help with this: recommended is to test talos thoroughly, but standard unit tests can be skipped)

A bit more information can be found in this blog post from armenzg: http://armenzg.blogspot.com/2011/12/taloszip-talosjson-and-you.html

Running locally - Source Code

http://hg.mozilla.org/build/talos/archive/tip.tar.gz may be installed in the usual manner for a python package (easy_install, etc). However, `pip` on windows fails due to the pywin32 dependency. See bug 787496

For the majority of the tests, we include test files and tools out of the box. We need to do these things:

clone talos:

hg clone http://hg.mozilla.org/build/talos

run the install script which
- creates a Virtualenv in the same directory as 'INSTALL.py'
- installs the talos python package into the virtualenv, including its MozBase dependencies via 'python setup.py develop'
- (alternatively, you can perform these steps yourself)
activate the virtualenv:

(on windows):

Scripts\activate.bat

(on osx/linux):

. bin/activate

unpack a copy of firefox somewhere (for this example, we'll use `which firefox` as if firefox was on your path)
setup a webserver if you're not using the '--develop' flag (WE STRONGLY RECOMMEND THE --develop FLAG)
- setup apache or similar webserver to have http://localhost -> the talos subdirectory of the talos checkout
- alternatively you can use the --develop flag to PerfConfigurator which configures to use a python webserver, mozhttpd, as shown below
generate a config script:

PerfConfigurator --develop --executablePath `which firefox` --activeTests ts --results_url ts.txt --output ts_desktop.yml --mozAfterPaint

- --develop indicates to run in develop mode and to set up a webserver for you
- --executablePath tells Talos where the firefox installation we want to run is located
  - we have `which firefox` as an example, you can use '~/mozilla/objdir/dist/bin/firefox' or whatever the full path is to your firefox executable that will be running the tests.
- --activeTests is a list of tests we want to run separated by ':'. In this example, we are running the startup test, ts.
- --results_url indicates a HTTP URL to POST results to or a file to append to
- --output is the new config file we want to generate

You can use PerfConfigurator --help to get a complete list of options

run tests

talos -n ts_desktop.yml

If you're looking to run remote talos, instructions are at: https://wiki.mozilla.org/Mobile/Fennec/Android#talos

We do not include the tp5 or similar pagesets for legal restrictions.

Talos will refuse to run if you have an open browser.

If you want to load an extension while running Talos, you want a few more command line arguments:

PerfConfigurator--executablePath=../firefox/firefox --sampleConfig=sample.config --activeTests=ts --extension=mozmill-1.5.2-sb+tb+fx+sm.xpi --addonID=mozmill@mozilla.com --output=my.config

--extension is the file of the XPI we want to install in the profile
--addonID is the ID of the addon to install

How Talos is Run in Production

buildbot constructs commands to launch PerfConfigurator from a config file: http://hg.mozilla.org/build/buildbot-configs/raw-file/tip/mozilla-tests/config.py ; there are a number of suites, each of which may contain multiple tests

a slave invokes run_tests.py on the generated YAML Talos configuration

Talos will run the tests and measured results which are uploaded to graphserver (after being suitably averaged per-page for Pageloader tests)
- (Talos also uploads raw results to datazilla)

the graphserver performs averaging across the pageset (for Pageloader tests) or across cycles (for Startup tests) and returns a number via HTTP to Talos which is then printed to the log

TBPL receives the name of the suite from buildbot. These are correlated to TBPL letters via http://hg.mozilla.org/users/mstange_themasta.com/tinderboxpushlog/file/tip/js/Config.js . The computed results from graphserver are scraped from the log and displayed when a TBPL suite letter is clicked on in the lower right corner

Running locally - Standalone Talos (DEPRECATED)

Talos, the pageloader, and a distributable web page test for tp has been packaged together into Standalone Talos. Following the given directions you will be able to run all the talos tests on your local machine. The results you collect will not be directly comparable to production talos runs on Mozilla machines, but by testing browser with/without whatever changes you are interested in you should be able to get an initial check on any performance regressions.

StandaloneTalos is deprecated: https://bugzilla.mozilla.org/show_bug.cgi?id=714659 and see https://wiki.mozilla.org/Buildbot/Talos#Running_locally_-_Source_Code for how to run modern Talos

Bugs

Talos bugs are filed under Testing/Talos, such as requests for new tests or repairs to the talos code itself.

Talos bugs that need to be staged and checked in are marked talos-checkin-needed in the whiteboard

Graph server bugs are filed under Webtools/Graph server.

Talos machine maintenance bugs are filed under mozilla.org/Release Engineering, such as bugs having to do with the hardware that talos is run on or requests to run extra talos tests against a given build.

A 2012Q1 effort is to get Talos on mozharness in production. See the tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=713055

Other usage of Talos

https://wiki.mozilla.org/Auto-tools/Projects/JetPerf
compare-talos : https://bitbucket.org/mconnor/compare-talos deployed at http://perf.snarkfest.net/compare-talos/

Happy Talos testing!

Buildbot/Talos

Talos

The Code

The Test Machines

Order of Operations

Regressions

Talos Tests

tp5

Private Bytes

RSS (Resident Set Size)

Xres (X Resource Monitoring)

Working Set (tp5_memset)

Modified Page List Bytes

% CPU

Responsiveness

ts_paint

ts_places_generated_med

ts_places_generated_max

tdhtml

tsvg

tsvg-opacity

tpaint

sunspider

JSS/Domaeo Tests

Dromaeo CSS

Dromaeo DOM

Dromaeo JSLIB

Dromaeo SunSpider

Dromaeo Basics

Dromaeo V8

Trace Malloc

a11y

tscroll

tresize

xperf

kraken

V8, version 7

Other data that is reported to graphserver and reported to dev.tree-management as talos failures

Talos Test Types

Startup Tests

Page Load Tests

Running Tp the Automation Way

Paint Tests

NoChrome Tests

Adding a new test

Addon Testing

Mobile testing

How are the numbers calculated?

Pageload style tests (tp5, tdhtml, etc)

Ts style tests (ts, twinopen, ts_cold, etc)

Where are the numbers stored?

Background Information

Naming convention

History of tp Tests

tp

tp2/tp_js

tp3

tp4

tp4m

tp5

Let's Run The Tests!

I have a patch for Talos, what tests to I run?

Testing Locally

Testing on Try Server

Are my numbers ok

Using try server

Running locally - Source Code

How Talos is Run in Production

Running locally - Standalone Talos (DEPRECATED)

Bugs

Other usage of Talos

Blog posts about Talos