Buildbot/Talos/Tests

From MozillaWiki
Revision as of 18:28, 12 December 2012 by Jhammel (talk | contribs) (→‎Where to get this information: remove obselete link and note --print-tests)

Talos Tests

Where to get this information

A table detailing information flow from buildbot to talos to TBPL and graphserver is available at http://k0s.org:8080/ . This is generated with the talosnames script, as detailed in http://k0s.org/mozilla/blog/20120724135349 . See also bug 770460.

Talos Test Types

There are two types of Talos tests:

  • Startup Tests: start the browser, wait for either the load event or the paint event, then exit, measuring the elapsed time
  • Page Load Tests: load a manifest of pages

Startup Tests

Startup tests launch Firefox and measure the time to the onload or paint events. Firefox is invoked with a URL to:

Page Load Tests

Many of the Talos tests use the page loader to load a manifest of pages. These tests load specific pages and measure the time it takes to load, scroll, or draw each page. To run a page load test, you need a manifest of pages to run. The manifest is simply a list of URLs of pages to load, one per line, e.g.:

http://www.mozilla.org
http://www.mozilla.com

Example: http://hg.mozilla.org/build/talos/file/tip/talos/page_load_test/svg/svg.manifest

Manifests may also specify that a test computes its own data by prefixing the line with a %:

% http://www.mozilla.org
% http://www.mozilla.com

Example: http://hg.mozilla.org/build/talos/file/tip/talos/page_load_test/v8_7/v8.manifest
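As a rough illustration of the manifest format described above, the following Python sketch reads manifest text and records whether each page is flagged with %. The function name parse_manifest and the tuple layout it returns are hypothetical and are not part of Talos or the pageloader; skipping blank lines is also an assumption.

```python
def parse_manifest(text):
    """Return a list of (url, computes_own_data) tuples from manifest text."""
    pages = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # assumption: blank lines carry no meaning
        if line.startswith("%"):
            # A leading % marks a page that computes its own data.
            pages.append((line[1:].strip(), True))
        else:
            pages.append((line, False))
    return pages


example = "http://www.mozilla.org\n% http://www.mozilla.com\n"
for url, computes_own_data in parse_manifest(example):
    print(url, computes_own_data)
```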

The manifest you created should be referenced in your config file. For example, open sample.config and look for the line referring to the test you want to run:

- name: tp4
  url: '-tp page_load_test/tp4.manifest -tpchrome -tpnoisy -tpformat tinderbox -tpcycles 10'
  • -tp controls the location of your manifest
  • -tpchrome tells Talos to run the browser with the normal browser UI active
  • -tpnoisy means "generate lots of output"
  • -tpformat controls the format of the results; the default is the format we send to displays like graphserver and TBPL.
  • -tpcycles controls the number of times we run the entire test.
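To make the structure of the url option string concrete, here is a minimal Python sketch that splits it into a dictionary. The helper parse_tp_options is hypothetical and is not how Talos or the pageloader actually reads these options; it assumes that an option is followed by a value only when the next token does not start with a dash.

```python
import shlex


def parse_tp_options(url):
    """Split a pageloader option string into a dict.

    Options followed by a value map to that value (as a string);
    bare flags map to True.
    """
    tokens = shlex.split(url)
    options = {}
    i = 0
    while i < len(tokens):
        name = tokens[i]
        # Assumption: a value never starts with "-".
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
        if has_value:
            options[name] = tokens[i + 1]
            i += 2
        else:
            options[name] = True
            i += 1
    return options


opts = parse_tp_options(
    "-tp page_load_test/tp4.manifest -tpchrome -tpnoisy "
    "-tpformat tinderbox -tpcycles 10"
)
print(opts["-tp"], opts["-tpcycles"], opts["-tpchrome"])
```

Values are kept as strings, so a caller would convert -tpcycles to an integer itself.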

Paint Tests

Paint tests measure the time to receive both the MozAfterPaint and OnLoad events instead of just the OnLoad event.

Currently we run _paint tests for these tests:

  • ts_paint
  • tpaint
  • tp5n
  • sunspider
  • tdhtml
  • a11y
  • tscroll

NoChrome Tests

All tests run through the pageloader extension can be run with or without browser chrome. The tests load the same pages as described above in either case. The majority of tests are run with browser chrome enabled. On mobile (native Android builds) we have to run everything as nochrome since we don't support additional XUL windows.

Running tests without the browser chrome makes it possible to further isolate performance regressions.

Currently we run tdhtml as nochrome.

Test Descriptions

tp5

  • contact: :jhammel, :jmaher
  • source: not available
  • type: PageLoader
Talos test name | Graphserver name | Description
tp5r | Tp5r MozAfterPaint | tp5 with responsiveness
tp5row | Tp5 Row Major MozAfterPaint | tp5r running in Row Major with 25 cycles/page, ignoring the first 5
tp5n | Tp5 No Network Row Major MozAfterPaint | tp5row with a new tp5.zip that has no 404s and no external network access
tp5 | Tp5 MozAfterPaint | Measures the time to load a webpage and receive [both a MozAfterPaint and OnLoad event].

Tests the time it takes Firefox to load the tp5 web page test set. The page set was culled from the Alexa top 500 as of April 8th, 2011 and consists of 100 pages.

Unfortunately, we do not distribute a copy of the set of test web pages, as distributing them would not constitute fair use. Here are the broad steps we use to create the test set:

  1. Take the Alexa top 500 sites list
  2. Remove all sites with questionable or explicit content
  3. Remove duplicate sites (for example, many Google search front pages)
  4. Manually select to keep interesting pages (such as pages in different locales)
  5. Select a more representative page from any site presenting a simple search/login/etc. page
  6. Deal with the Windows 255 character limit for cached pages
  7. Limit test set to top 100 pages

Note that the above steps did not eliminate all outside network access, so we had to take further action to scrub all the pages so that there are 0 outside network accesses. This is done so that the tp test is as deterministic a measurement of our rendering/layout/paint process as possible. If you are on the Mozilla intranet, you can obtain the current page set for local testing. DO NOT DISTRIBUTE IT.

Private Bytes

A memory metric tracked during tp4 test runs. This metric is sampled every 20 seconds.

For Windows, see the description from Microsoft TechNet.

RSS (Resident Set Size)

A memory metric tracked during tp5 test runs. This metric is sampled every 20 seconds. This metric is collected on Linux and Mac only.

Description from wikipedia.
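For readers unfamiliar with the metric, the sketch below shows one way to read a resident set size on Linux, from the VmRSS field of /proc/<pid>/status. This is only an illustration and is not necessarily how Talos collects the value; the function name parse_vm_rss_kb is hypothetical, and the sample text is made up for the example.

```python
def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # Field layout: "VmRSS:     12345 kB"
            return int(line.split()[1])
    return None


# Made-up excerpt in the layout of /proc/<pid>/status.
sample = "Name:\tfirefox\nVmPeak:\t  900000 kB\nVmRSS:\t  123456 kB\n"
print(parse_vm_rss_kb(sample))
```

On a Linux machine the same function could be fed the contents of /proc/self/status; a harness would call it once per sampling interval.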

Xres (X Resource Monitoring)

A memory metric tracked during tp5 test runs. This metric is sampled every 20 seconds. This metric is collected on Linux only.

xres man page.

Working Set (tp5_memset)

A memory metric tracked during tp5 test runs. This metric is sampled every 20 seconds. This metric is collected on Windows only. Description from Microsoft TechNet.

Modified Page List Bytes

A memory metric tracked during tp5 test runs. This metric is sampled every 20 seconds. This metric is collected on Windows 7 only. Description from Microsoft MSDN.

% CPU

CPU usage tracked during tp5 test runs. This metric is sampled every 20 seconds. This metric is collected on Windows only.

Responsiveness

Reports the delay in milliseconds for the event loop to process a tracer event. For more details, see bug 631571.

ts_paint

  • contact: :mak, :jimm, :jhammel, :jmaher
  • source: tspaint_test.html
  • Perfomatic: "Ts, Paint"
  • type: Startup

Launches tspaint_test.html with the current timestamp in the url, waits for [MozAfterPaint and onLoad] to fire, then records the end time and calculates the time to startup.
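The timing scheme described above can be sketched in Python as follows. The helper names and the "begin" query parameter are assumptions made for illustration; the actual harness and test page may pass and read the timestamp differently.

```python
import time
from urllib.parse import urlencode


def build_startup_url(page, now_ms=None):
    """Append the launch timestamp (milliseconds) to the test page URL."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # "begin" is a hypothetical parameter name used for this sketch.
    return "%s?%s" % (page, urlencode({"begin": now_ms}))


def startup_time_ms(begin_ms, end_ms):
    """Startup time is the end timestamp minus the launch timestamp."""
    return end_ms - begin_ms


url = build_startup_url("tspaint_test.html", now_ms=1000)
print(url)
print(startup_time_ms(1000, 1450))
```

Here the end timestamp would be taken by the page once both events have fired; the fixed values are used only to keep the example reproducible.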

The basic ts test uses a blank profile. This test was known as ts before it began waiting for the MozAfterPaint event.

ts_places_generated_med

  • contact: :mak, :jhammel, :jmaher
  • source: tspaint_test.html
  • type: Startup
  • dirty: this is also referred to as the dirty test

Runs the same test as ts_paint, but uses a generated profile to simulate what an average user would have. This profile is very outdated and needs to be updated.

ts_places_generated_max

  • contact: :mak, :jhammel, :jmaher
  • source: tspaint_test.html
  • type: Startup
  • dirty: this is also referred to as the dirty test

Runs the same test as ts_paint, but uses a generated profile to simulate what a power user would have. This profile is very outdated and needs to be updated.

tdhtml

  • turned off on all branches and platforms November 1st, 2012
  • contact: :peterv, :jhammel, :jmaher
  • source: dhtml.manifest
  • type: PageLoader
Talos test name | Graphserver name | Description
tdhtmlr | DHTML Row Major | Row based and 25 cycles/page.
tdhtml.2 | DHTML 2 | Ignoring the first value instead of the highest (usually the highest is the first)
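The difference between the two filtering strategies named in these test variants (dropping the first value versus dropping the highest value for a page) can be sketched as below. The function names ignore_first and ignore_max are chosen for this example, and the sample values are invented.

```python
def ignore_first(values, count=1):
    """Drop the first `count` measurements for a page."""
    return list(values[count:])


def ignore_max(values):
    """Drop one occurrence of the largest measurement for a page."""
    remaining = list(values)
    if remaining:
        remaining.remove(max(remaining))
    return remaining


# Invented timings: the first load is the slowest, as is usually the case.
typical = [310, 120, 118, 121, 119]
print(ignore_first(typical))
print(ignore_max(typical))

# When the slowest load is not the first one, the two filters differ.
unusual = [120, 310, 118]
print(ignore_first(unusual))
print(ignore_max(unusual))
```

When the first measurement is also the largest, both filters keep the same values, which is why the two variants usually report similar numbers.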

Tests which measure the time to cycle through a set of DHTML test pages. This test will be updated in the near future.

This test is also run with the nochrome option.

tsvg

  • contact: :jwatt, :jhammel, :jmaher
  • source: svg.manifest
  • type: PageLoader
Talos test name | Graphserver name | Description
svgr | SVG Row Major | Row Major and 25 cycles/page.
svg | SVG | Column based and 5 cycles.
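The two load orders mentioned in these descriptions can be sketched as follows. This rests on an assumption drawn from the "cycles/page" wording: "Row Major" is taken to mean that each page is loaded repeatedly before moving to the next page, while "Column based" is taken to mean cycling through the whole page list several times. The function names are hypothetical.

```python
def column_major_order(pages, cycles):
    """Cycle through the whole page list `cycles` times."""
    return [page for _ in range(cycles) for page in pages]


def row_major_order(pages, cycles_per_page):
    """Load each page `cycles_per_page` times before moving on."""
    return [page for page in pages for _ in range(cycles_per_page)]


pages = ["a.svg", "b.svg"]
print(column_major_order(pages, 2))
print(row_major_order(pages, 2))
```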

An svg-only number that measures SVG rendering performance.

tsvg-opacity

  • contact: :jwatt, :jhammel, :jmaher
  • source: svg.manifest
  • type: PageLoader
Talos test name | Graphserver name | Description
svgr_opacity | SVG, Opacity Row Major | Row Major and 25 cycles/page.
svg_opacity | SVG, Opacity | Column based and 5 cycles.

An svg-only number that measures SVG rendering performance.

tpaint

Talos test name | Graphserver name | Description
tpaint | Paint | twinopen but measuring the time after we receive the [MozAfterPaint and OnLoad event].
twinopen | | original test to measure the time to open window based on OnLoad event.
txul | Txul | another name for twinopen. Also we report txul in the regression emails.

Tests the amount of time it takes to open a new window. This test does not include startup time. Multiple test windows are opened in succession; the results reported are the average amount of time required to create and display a window in the running instance of the browser. (Measures ctrl-n performance.)

sunspider

  • contact: :dmandelin, :jhammel, :jmaher
  • source: [sunspider.manifest]
  • type: PageLoader
  • measuring: ???
  • reporting: ???
Talos test name | Graphserver name | Description
sunspider | SunSpider 0.9.1 | updated sunspider version 0.9.1 test suite.
tsspider.2 | SunSpider 2 | same as tsspider, but ignore the first value instead of the largest.
tsspider | SunSpider | original sunspider test (version unknown), run 5 cycles

This is the SunSpider JavaScript benchmark taken verbatim and slightly modified to fit into our pageloader extension and Talos harness.

The previous version of this test is tsspider, which is now deprecated.

JSS/Dromaeo Tests

Dromaeo suite of tests for JavaScript performance testing. See the Dromaeo wiki for more information.

This suite is divided into several sub-suites.

Dromaeo CSS

  • contact: :dmandelin, :jhammel, :jmaher
  • source: [css.manifest]
  • type: PageLoader

Each page in the manifest is part of the Dromaeo CSS benchmark.

Dromaeo DOM

  • contact: :dmandelin, :jhammel, :jmaher
  • source: [dom.manifest]
  • type: PageLoader

Each page in the manifest is part of the Dromaeo DOM benchmark.

Dromaeo JSLIB

The set of pages is loaded sequentially once.

The list of pages is found here.

The test pages themselves are here.

Deprecated as of June 30th, 2012

Dromaeo SunSpider

The set of pages is loaded sequentially once.

The list of pages is found here.

The test pages themselves are here.

Deprecated as of June 30th, 2012

Dromaeo Basics

The set of pages is loaded sequentially once.

The list of pages is found here.

The test pages themselves are here.

Deprecated as of June 30th, 2012

Dromaeo V8

The set of pages is loaded sequentially once.

The list of pages is found here.

The test pages themselves are here.

Deprecated as of June 30th, 2012

Trace Malloc

This test uses the trace-malloc tool from tools/trace-malloc to wrap calls to malloc and log information about every memory allocation.

a11y

  • contact: :davidb, :tbsaunde, :jhammel, :jmaher
  • source: [a11y.manifest]
  • type: PageLoader
  • measuring: ???
  • reporting: ???
Talos test name | Graphserver name | Description
a11yr | a11y Row Major MozAfterPaint | Row Major testing with 25 cycles per page
a11y.2 | a11y 2 MozAfterPaint | same as a11y ignoring the first value collected instead of the largest
a11y | a11y MozAfterPaint | iterate through each page, 5 cycles through the list, ignore the highest value from each page

This test ensures basic a11y tables and permutations do not cause performance regressions.

tscroll

  • contact: :jrmuizel, :jhammel, :jmaher
  • source: [scroll.manifest]
  • type: PageLoader
  • measuring: ???
  • reporting: ???
Talos test name | Graphserver name | Description
tscrollr | tscroll Row Major | Row Major testing with 25 cycles
tscroll.2 | tscroll 2 | Ignore the first value for each page instead of the largest
tscroll | tscroll | run through each page in the manifest and cycle 5 times. For each page, ignore the largest value

This test measures scrolling performance on the pages in the manifest.

tresize

  • contact: :jimm, :jmaher
  • source: [tresize-test.html]
  • type: StartupTest
  • measuring: Time to do XUL resize
  • reporting: ???
Talos test name | Graphserver name | Description
tresize | tresize | TODO

This test measures the time to do a XUL resize.

xperf

  • contact: :taras, :jhammel, :jmaher
  • source: [xperf instrumentation]
  • type: Pageloader (tp5n)
  • measuring: IO counters from windows
  • reporting: Summary of read/write counters for disk, network

Xperf runs tp5 while collecting xperf metrics for disk IO and network IO. The providers we listen for are:

The values we collect during stackwalk are:

kraken

  • contact: :dmandelin, :jhammel, :jmaher
  • source: [kraken.manifest]
  • type: PageLoader
  • measuring: ???
  • reporting: ???
  • Perfomatic: Kraken Benchmark MozAfterPaint

This is the Kraken JavaScript benchmark taken verbatim and slightly modified to fit into our pageloader extension and Talos harness.

V8, version 7

  • contact: :dmandelin, :jhammel, :jmaher
  • source: [v8.manifest]
  • type: PageLoader
  • measuring: ???
  • reporting: ???
  • Perfomatic name: V8 version 7 MozAfterPaint

This is the V8 (version 7) JavaScript benchmark taken verbatim and slightly modified to fit into our pageloader extension and Talos harness.

The previous version of this test is V8 version 5, which was run on select branches and operating systems.

Other data that is reported to graphserver and reported to dev.tree-management as talos failures

Linux

Linux, OSX 10.7

  • codesighs (build with --enable-codesighs)

Linux 64 debug, osx 10.7 debug, win debug

  • source is leaktest.py in the tree
  • trace_malloc_leaks
  • trace_malloc_maxheap
  • trace_malloc_alloc