CI Automation/windows10 aarch64: Difference between revisions

From MozillaWiki
 
(24 intermediate revisions by 2 users not shown)
Line 1: Line 1:
=Overview=
=Overview=
Since mid-January 2019 the CI-A team has been working to enable existing test harnesses, continuous integration tests and other tools to run on Windows 10 ARM64.
Since mid-January 2019 the [https://wiki.mozilla.org/CI_Automation CI-A team] has been working to enable existing test harnesses, continuous integration tests and other tools to run on Windows 10 ARM64, aka aarch64.


= General Information =
= General Information =
Line 15: Line 15:
== Hosting ==
== Hosting ==


Currently an array of 9 machines are hosted at [https://bitbar.com/ Bitbar] in the United States.
Currently an array of ~30 machines are hosted at [https://bitbar.com/ Bitbar] in the United States.


= Setup =
= Setup - local environment =
Developers wishing to run tests locally have two methods.


Tests that are run against windows10-aarch64 execute using [https://github.com/taskcluster/generic-worker Taskcluster Generic-Worker]. These are installed as a service on the Windows 10 ARM64 manually or via [https://github.com/mozilla-releng/OpenCloudConfig OpenCloudConfig].
== Prequisites ==


A brief walkthrough of the steps to have Taskcluster Generic-Worker running on Windows 10 ARM64 will be provided.
# download and install [https://ftp.mozilla.org/pub/mozilla/libraries/win32/MozillaBuildSetup-2.2.0.exe Mozilla-Build 2.2.0]


== Using only generic-worker ==
=== Using mozilla-build ===


Follow this step to install Taskcluster Generic-Worker on the hardware, and have it launch as a service. After following these steps, the hardware should be ready to accept any tasks started on Taskcluster.  
This method uses a script to download test archives in order to run tests locally.
 
# download <code>script for running mozharness on Yoga</code> from [https://bugzilla.mozilla.org/show_bug.cgi?id=1520867 bug 1520867]
# place the test runner script in the <code>C:\mozilla-build</code> directory
# from treeherder, identify a changeset that contains a successful <code>build-win64-aarch64/opt</code>
# copy the task ID of the build
# invoke start-shell.bat, which will launch a bash-like commandline
# from mozilla-build directory, run the test runner script as follows:
<code>bash script.sh task_id test_type <chunk_to_run> <total_chunks></code>
 
Example:
<code>bash script.sh Q-CE8DFvSAWmc08vw6bd6A xpcshell 1 8</code>
 
=== Using mozilla-central ===
 
This method is taken from [https://www.gijsk.com/blog/2019/02/getting-firefox-artifact-builds-working-on-an-arm64-aarch64-windows-device/ this guide] and uses mozilla-central with a build artifact.
 
# invoke start-shell.bat, which will launch a bash-like commandline
# clone the repository using <code>hg clone https://hg.mozilla.org/mozilla-central/</code>
# run <code>./mach bootstrap</code> and pick artifact build
# download python3 embeddable zip, then extract to <code>mozilla-build/</code> directory
# remove [https://searchfox.org/mozilla-central/rev/152993fa346c8fd9296e4cd6622234a664f53341/python/mozboot/mozboot/bootstrap.py#444 this line]
# download 32bit NodeJS zip and extract to <code>.mozbuild/node</code>
# inside mozilla-build, remove the directory named <code>watchman</code>
# rerun <code>./mach bootstrap</code>
# run <code>./mach build</code>
 
After the artifact build succeeds, it is possible to run most suites of tests as normal:
<code>./mach mochitest <test_file></code>
 
= CI environment =
 
Tests that are run in Taskcluster environment against windows10-aarch64 execute using [https://github.com/taskcluster/generic-worker Taskcluster Generic-Worker]. These are installed as a service on via [https://github.com/mozilla-releng/OpenCloudConfig OpenCloudConfig].
 
== Using OpenCloudConfig ==
 
This is the method used in production.
 
Steps originally taken from [https://bugzilla.mozilla.org/show_bug.cgi?id=1520432#c2 1520432].
 
$gitBranchOrRef = 'master'
Invoke-Expression (New-Object Net.WebClient).DownloadString(('https://raw.githubusercontent.com/mozilla-releng/OpenCloudConfig/{0}/userdata/rundsc.ps1?{1}' -f $gitBranchOrRef, [Guid]::NewGuid()))
 
== Manually install Generic-Worker [Not recommended] ==
 
Follow these step to install Taskcluster Generic-Worker on the hardware, and have it launch as a service.


Instruction originally from [https://bugzilla.mozilla.org/show_bug.cgi?id=1522997#c2 1522997].
Instruction originally from [https://bugzilla.mozilla.org/show_bug.cgi?id=1522997#c2 1522997].


=== Prerequisites ===
'''Prerequisites'''
* disable Windows S mode
* disable Windows S mode
* disable User Account Control
* disable User Account Control
Line 39: Line 85:
* request scope `assume:project:taskcluster:generic-worker-tester`  
* request scope `assume:project:taskcluster:generic-worker-tester`  


=== Steps ===
'''Steps'''
 
# download the current 386 release of `generic-worker-windows-386.exe` from [https://github.com/taskcluster/generic-worker/releases taskcluster generic-worker].
# download the current 386 release of `generic-worker-windows-386.exe` from [https://github.com/taskcluster/generic-worker/releases taskcluster generic-worker].
# download the latest 386 version of livelog.exe and taskcluster-proxy.exe.
# download the latest 386 version of livelog.exe and taskcluster-proxy.exe.
Line 54: Line 99:
  "ed25519SigningKeyLocation":  "<file location you wrote ed25519 private key in step 6>",
  "ed25519SigningKeyLocation":  "<file location you wrote ed25519 private key in step 6>",
  "livelogSecret":              "<any text>",
  "livelogSecret":              "<any text>",
"openpgpSigningKeyLocation":  "<file location you wrote gpg private key kn step 6>",
  "provisionerId":              "test-provisioner",
  "provisionerId":              "test-provisioner",
  "publicIP":                  "<ideally an IP address of one of your network interfaces>",
  "publicIP":                  "<ideally an IP address of one of your network interfaces>",
Line 68: Line 112:
# sc query "Generic Worker"
# sc query "Generic Worker"


== Using OpenCloudConfig ==
= Currently running on CI =


This is the method that is used in production.
Currently, a limit subset of tests are running regularly on <code>mozilla-central</code> and <code>try</code>. This is to reduce the load on the windows10-aarch64 hardware, which is limited in number.


Steps originally taken from [https://bugzilla.mozilla.org/show_bug.cgi?id=1520432#c2 1520432].
= Run on try =


# Invoke-Expression (New-Object Net.WebClient).DownloadString('https://raw.githubusercontent.com/mozilla-releng/OpenCloudConfig/aarch64/userdata/rundsc.ps1')
This is probably what you came to the document for. How to run tests against the windows10-aarch64 hardware currently available.  


= Currently Running =
'''Hardware is limited so please exercise caution when scheduling tests! A careless try will block many others. Only schedule jobs that are absolutely necessary.'''


Currently supported list of tests include:
== Prerequisites ==


* awsy
* try access (commit access level 1)
* mochitest (all flavors, including e10s)
* up-to-date mozilla-central codebase
* web-platform-tests (all flavors)
* reftests (including crashtest, jsreftest)
* xpcshell


Supported, requires non-artifact build:
== Steps ==


* jittest
Note that on <code>try</code>, windows10-aarch64 is hidden by default; please use <code>./mach try fuzzy --full</code> to schedule jobs.
* gtests
* cppunittest


There is remaining work needed to get these test suites running:
# <code>./mach try fuzzy --full</code>
* talos
# select tests that need to be run (e.g. 'windows10-aarch64 xpcshell')
* raptor
# enter
* marionette


For an up-to-date list of tests, please refer to [https://searchfox.org/mozilla-central/source/taskcluster/ci/test/test-platforms.yml#222 this file].
Tests will appear in Treeherder under the heading ''Windows 10 AArch64 opt''.


= Run tests Locally =
= Greening tests =
Theoretically, you can run tests locally with mach from a local build environment. However, since our aarch64 builds are usually cross-compiled in an x86 environment, you probably don't have a local build environment!


The recommended alternative is to use mozharness to download, install, and test a build from try or continuous integration. A handy script is provided as an attachment to {{bug|1520867}} that greatly simplifies running tests from mozharness; let's call that script 'moztest'.
Since Windows on ARM64 is a new platform/architecture combination, failures unique to this combination is to be expected. It will be necessary to fix, correct or update the tests in order to obtain a green run.


Run moztest from a MozillaBuild shell. You need only a few parameters:
== Example 1 ==
* The task-id of the Windows-aarch64 build that you want to test: Click on the aarch64 build in treeherder, and copy the "Task" shown in the treeherder detail pane; it might look like "Q-CE8DFvSAWmc08vw6bd6A".
* The name of the test suite you want to run: one of (cppunit, gtest, xpcshell, mochitest, mochitest-chrome, mochitest-clipboard, mochitest-dt, mochitest-gpu, mochitest-media, crashtest, jsreftest, reftest, jittest, web-platform, web-platform-reftest, web-platform-wdspec, raptor-speedometer, raptor-tp6, talos-g5, talos-chromez)
* Optionally, the test "chunk" number to run and the number of test chunks to split the suite into.


For example:
As part of [https://bugzilla.mozilla.org/show_bug.cgi?id=1525743 1525743], the timeout for mochitest-browser-chrome was extended to 4x the default value if the platform combination of Windows and ARM64 is detected.
* moztest Q-CE8DFvSAWmc08vw6bd6A cppunit
* moztest Q-CE8DFvSAWmc08vw6bd6A xpcshell 1 3


= Run tests on Try =
See change: https://phabricator.services.mozilla.com/D19882


This is probably what you came to the document for. How to run tests against the windows10-aarch64 hardware currently available. Note, the number of hardware is limited so please exercise caution when scheduling tests.
This change greened the test that was previously failing due to a timeout.


== Overview ==
== Example 2 ==


Follow these steps to be able to enable windows10-aarch64 tests for the try server. These steps are required as of 2019-02-25; it will become obsolete when windows10-aarch64 tests are released to the general public.
Some tests provide a manifest file in the form of <test_category>.ini, such as ''mochitest.ini''.


=== Prerequisites ===
For [https://bugzilla.mozilla.org/show_bug.cgi?id=1525665 bug 1525665] it was determined to disable a certain a11y test while windows10-aarch64 a11y support was being investigated.


* try access (commit access level 1)
See change: https://phabricator.services.mozilla.com/D22363
* up-to-date mozilla-central codebase


=== Steps ===
This change meant the failing test is now disabled for windows10-aarch64, and the test would have been green had it not been for another failure elsewhere.


# open the file at taskcluster/ci/test/test-platforms.yml
== Example 3 ==
# search for 'windows10-aarch64/opt'
# uncomment all or some of the items under 'test-sets'
# make changes to the local codebase that needs testing
# ./mach try fuzzy
# select tests that need to be run (e.g. 'windows10-aarch64 xpcshell')
# enter


Tests will appear in Treeherder under the heading ''windows10-aarch64 opt''.
Another example of manipulating the manifest of a category of tests, this time with ''web-platform-tests''.


= Greening tests =
For [https://bugzilla.mozilla.org/show_bug.cgi?id=1533912 bug 1533912], the manifest was modified to disable the test if it was running on aarch64 hardware.


Since Windows on ARM64 is a new platform/architecture combination, failures unique to this combination is to be expected. It will be necessary to fix, correct or update the tests in order to obtain a green run.
See change: https://phabricator.services.mozilla.com/D23003
 
== Example 1 ==


As part of [https://bugzilla.mozilla.org/show_bug.cgi?id=1525743 1525743], the timeout for mochitest-browser-chrome was extended to 4x the default value if the platform combination of Windows and ARM64 is detected.
Note that web-platform-tests use a slightly different format in order.


See change: https://phabricator.services.mozilla.com/D19882
== Example 4 ==


This change greened the test that was previously failing due to a timeout.
Certain test cases in reftest/crashtest/jsreftest had unexpected outcomes on windows10-aarch64.


== Example 2 ==
For [https://bugzilla.mozilla.org/show_bug.cgi?id=1536365 bug 1536365] and [https://bugzilla.mozilla.org/show_bug.cgi?id=1536363 bug 1536363], the requirement was to adjust the pixel-difference values such that tests will pass.


<TODO add instructions for reftest manifest, manifest parser, and wpt manifest for skip-if>
See change: https://phabricator.services.mozilla.com/D25113


= Bugs =
= Bugs =


These are the top-level tracking bugs; the recommended view is [https://bugzilla.mozilla.org/showdependencytree.cgi?id=1522997&hide_resolved=0 tree] (login required).
These are the top-level tracking bugs; the recommended view is [https://bugzilla.mozilla.org/showdependencytree.cgi?id=1522997&hide_resolved=0 tree] (login required).
CI-A team will make efforts to re-test disabled tests on a semi-regular basis, or whenever fixes are committed to components that had tests disabled.


<bugzilla>
<bugzilla>

Latest revision as of 21:35, 3 April 2020

Overview

Since mid-January 2019 the CI-A team has been working to enable existing test harnesses, continuous integration tests and other tools to run on Windows 10 ARM64, aka aarch64.

General Information

Hardware

  • Make: Lenovo
  • Model: C630 YOGA
  • Processor: Qualcomm Snapdragon 850 3.0GHz
  • Cores: 8
  • Memory: 8GB
  • Disk: 128GB SSD

Hosting

Currently, about 30 machines are hosted at Bitbar in the United States.

Setup - local environment

Developers who want to run tests locally can choose between two methods.

Prerequisites

  1. download and install Mozilla-Build 2.2.0

Using mozilla-build

This method uses a script to download test archives in order to run tests locally.

  1. download the attachment named "script for running mozharness on Yoga" from bug 1520867
  2. place the test runner script in the C:\mozilla-build directory
  3. in Treeherder, identify a changeset that contains a successful build-win64-aarch64/opt
  4. copy the task ID of the build
  5. invoke start-shell.bat, which will launch a bash-like command line
  6. from the mozilla-build directory, run the test runner script as follows:

bash script.sh task_id test_type <chunk_to_run> <total_chunks>

Example: bash script.sh Q-CE8DFvSAWmc08vw6bd6A xpcshell 1 8
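
The invocation above can be sketched as a small helper that assembles the argument list before the script is run. This is only an illustration: the function name build_runner_command is hypothetical, and treating the two chunk arguments as an optional pair is an assumption based on the usage line, not something the runner script is known to enforce.

```python
from typing import List, Optional


def build_runner_command(task_id: str, test_type: str,
                         chunk: Optional[int] = None,
                         total_chunks: Optional[int] = None) -> List[str]:
    """Return the argv for `bash script.sh task_id test_type [chunk total]`."""
    if not task_id or not test_type:
        raise ValueError("task_id and test_type are required")
    # Assumption: the chunk arguments only make sense when given together.
    if (chunk is None) != (total_chunks is None):
        raise ValueError("give both chunk and total_chunks, or neither")
    argv = ["bash", "script.sh", task_id, test_type]
    if chunk is not None:
        if not 1 <= chunk <= total_chunks:
            raise ValueError("chunk must be between 1 and total_chunks")
        argv += [str(chunk), str(total_chunks)]
    return argv


if __name__ == "__main__":
    # Reproduces the example from the text.
    print(" ".join(build_runner_command("Q-CE8DFvSAWmc08vw6bd6A", "xpcshell", 1, 8)))
```

The resulting list could be handed to subprocess.run from inside the MozillaBuild shell; the sketch stops short of executing anything.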

Using mozilla-central

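This method is taken from the guide at https://www.gijsk.com/blog/2019/02/getting-firefox-artifact-builds-working-on-an-arm64-aarch64-windows-device/ and uses mozilla-central with an artifact build.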

  1. invoke start-shell.bat, which will launch a bash-like command line
  2. clone the repository using hg clone https://hg.mozilla.org/mozilla-central/
  3. run ./mach bootstrap and pick artifact build
  4. download the Python 3 embeddable zip, then extract it to the mozilla-build/ directory
  5. remove the line linked here: https://searchfox.org/mozilla-central/rev/152993fa346c8fd9296e4cd6622234a664f53341/python/mozboot/mozboot/bootstrap.py#444
  6. download the 32-bit Node.js zip and extract it to .mozbuild/node
  7. inside mozilla-build, remove the directory named watchman
  8. rerun ./mach bootstrap
  9. run ./mach build

After the artifact build succeeds, it is possible to run most suites of tests as normal: ./mach mochitest <test_file>

CI environment

Tests that run in the Taskcluster environment against windows10-aarch64 execute using Taskcluster Generic-Worker, which is installed as a service via OpenCloudConfig.

Using OpenCloudConfig

This is the method used in production.

Steps originally taken from bug 1520432 (comment 2).

$gitBranchOrRef = 'master'
Invoke-Expression (New-Object Net.WebClient).DownloadString(('https://raw.githubusercontent.com/mozilla-releng/OpenCloudConfig/{0}/userdata/rundsc.ps1?{1}' -f $gitBranchOrRef, [Guid]::NewGuid()))
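
To make the URL in the PowerShell one-liner easier to read, here is a minimal sketch of how it is assembled: the branch or ref fills the first placeholder and a fresh GUID is appended as the query string. The function name rundsc_url is hypothetical, and the comment that the GUID is there to avoid a cached copy of the script is an assumption, not something the source states.

```python
import uuid
from typing import Optional

RUNDSC_URL_TEMPLATE = (
    "https://raw.githubusercontent.com/mozilla-releng/OpenCloudConfig/"
    "{0}/userdata/rundsc.ps1?{1}"
)


def rundsc_url(git_branch_or_ref: str = "master", token: Optional[str] = None) -> str:
    """Build the rundsc.ps1 URL the same way the PowerShell -f operator does."""
    # Assumption: the random query string only serves to bypass caching.
    if token is None:
        token = str(uuid.uuid4())
    return RUNDSC_URL_TEMPLATE.format(git_branch_or_ref, token)


if __name__ == "__main__":
    print(rundsc_url())
```

The sketch only builds the string; downloading and executing the script remains the job of the PowerShell command above.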

Manually install Generic-Worker [Not recommended]

Follow these steps to install Taskcluster Generic-Worker on the hardware and have it launch as a service.

Instructions originally from bug 1522997 (comment 2).

Prerequisites

  • disable Windows S mode
  • disable User Account Control
  • disable Windows Firewall
  • download NSSM to C:\nssm-2.24\
  • create "Remote Desktop Users" group:
net localgroup "Remote Desktop Users" /add
  • log in to Taskcluster
  • request scope `assume:project:taskcluster:generic-worker-tester`

Steps

  1. download the current 386 release of `generic-worker-windows-386.exe` from taskcluster generic-worker.
  2. download the latest 386 version of livelog.exe and taskcluster-proxy.exe.
  3. create new directory C:\generic-worker.
  4. move the three executable files under C:\generic-worker.
  5. rename generic-worker-windows-386.exe to generic-worker.exe.
  6. generate two signing keys:
generic-worker new-openpgp-keypair --file <unique_file_name>
generic-worker new-ed25519-keypair --file <unique_file_name>
  7. create generic-worker.config and include the following:
"accessToken":                "<access token tied to taskcluster>",
"clientId":                   "<client ID tied to taskcluster>",
"ed25519SigningKeyLocation":  "<file location you wrote ed25519 private key in step 6>",
"livelogSecret":              "<any text>",
"provisionerId":              "test-provisioner",
"publicIP":                   "<ideally an IP address of one of your network interfaces>",
"rootURL":                    "https://taskcluster.net",
"workerGroup":                "test-worker-group",
"workerId":                   "test-worker-id",
"workerType":                 "<a unique string that only you will use for your test worker(s)>"
  8. launch cmd.exe with Administrator rights.
  9. cd c:\generic-worker
  10. generic-worker.exe install service --config generic-worker.config --nssm c:\nssm-2.24\win32\nssm.exe
  11. reboot once installed.
  12. launch cmd.exe with Administrator rights.
  13. sc query "Generic Worker"
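
As a rough sketch, the configuration file from the steps above could be generated rather than typed by hand. It assumes generic-worker.config is a single JSON object holding exactly the keys listed above; the function name render_config and every argument value in the example are placeholders, not real credentials or paths.

```python
import json


def render_config(access_token: str, client_id: str, ed25519_key_path: str,
                  worker_type: str, public_ip: str, livelog_secret: str) -> str:
    """Return the text of a generic-worker.config with the keys listed above."""
    config = {
        "accessToken": access_token,
        "clientId": client_id,
        "ed25519SigningKeyLocation": ed25519_key_path,
        "livelogSecret": livelog_secret,
        # Fixed values copied from the instructions.
        "provisionerId": "test-provisioner",
        "publicIP": public_ip,
        "rootURL": "https://taskcluster.net",
        "workerGroup": "test-worker-group",
        "workerId": "test-worker-id",
        "workerType": worker_type,
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Placeholder values only; substitute your own before writing the file.
    print(render_config("<access token>", "<client ID>",
                        "C:\\generic-worker\\ed25519.key",
                        "my-unique-worker-type", "10.0.0.5", "any text"))
```

Serialising through json.dumps avoids quoting mistakes, in particular with backslashes in Windows paths.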

Currently running on CI

Currently, a limited subset of tests runs regularly on mozilla-central and try. This reduces the load on the windows10-aarch64 hardware, which is limited in number.

Run on try

This is probably what you came to this document for: how to run tests against the windows10-aarch64 hardware currently available.

Hardware is limited so please exercise caution when scheduling tests! A careless try will block many others. Only schedule jobs that are absolutely necessary.

Prerequisites

  • try access (commit access level 1)
  • up-to-date mozilla-central codebase

Steps

Note that on try, windows10-aarch64 is hidden by default; please use ./mach try fuzzy --full to schedule jobs.

  1. ./mach try fuzzy --full
  2. select tests that need to be run (e.g. 'windows10-aarch64 xpcshell')
  3. press Enter to submit the selection

Tests will appear in Treeherder under the heading Windows 10 AArch64 opt.
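
To illustrate what a selection such as 'windows10-aarch64 xpcshell' does, here is a toy model of the filtering step: every space-separated term must occur in the task label. This is a deliberately simplified stand-in for the fuzzy finder that mach try fuzzy uses, whose matching rules are richer, and the task labels in the example are hypothetical.

```python
from typing import List


def select_tasks(labels: List[str], query: str) -> List[str]:
    """Keep the labels that contain every term of the query (case-insensitive)."""
    terms = query.lower().split()
    return [label for label in labels
            if all(term in label.lower() for term in terms)]


if __name__ == "__main__":
    # Hypothetical labels, made up for this illustration.
    labels = [
        "test-windows10-aarch64/opt-xpcshell-e10s-1",
        "test-windows10-aarch64/opt-mochitest-e10s-1",
        "test-linux64/opt-xpcshell-e10s-1",
    ]
    for label in select_tasks(labels, "windows10-aarch64 xpcshell"):
        print(label)
```

Narrowing the query this way before submitting keeps the number of jobs scheduled on the limited hardware small.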

Greening tests

Since Windows on ARM64 is a new platform/architecture combination, failures unique to this combination are to be expected. It will be necessary to fix, correct or update the tests in order to obtain a green run.

Example 1

As part of bug 1525743, the timeout for mochitest-browser-chrome was extended to 4x the default value when the platform combination of Windows and ARM64 is detected.

See change: https://phabricator.services.mozilla.com/D19882

This change greened the test that was previously failing due to a timeout.
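
The idea behind this kind of fix can be sketched as follows. This is not the code from D19882: the function name effective_timeout, the parameter names, and the platform strings "win" and "aarch64" are assumptions chosen for illustration.

```python
def effective_timeout(default_timeout: float, os_name: str, processor: str,
                      multiplier: int = 4) -> float:
    """Scale the default timeout when running on Windows on ARM64."""
    # Assumption: the platform is identified by these two strings.
    if os_name == "win" and processor == "aarch64":
        return default_timeout * multiplier
    return default_timeout


if __name__ == "__main__":
    print(effective_timeout(45, "win", "aarch64"))
    print(effective_timeout(45, "win", "x86_64"))
```

Keeping the multiplier behind a platform check leaves timeouts on all other platforms unchanged.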

Example 2

Some tests provide a manifest file in the form of <test_category>.ini, such as mochitest.ini.

For bug 1525665, the decision was made to disable a certain a11y test while windows10-aarch64 a11y support was being investigated.

See change: https://phabricator.services.mozilla.com/D22363

With this change the failing test is disabled on windows10-aarch64; the run would have been green had it not been for another failure elsewhere.
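
A minimal sketch of how a manifest-based skip works is shown below. It parses an .ini-style entry with Python's configparser and evaluates a simplified condition made only of `key == "value"` clauses joined by `&&`. The test name, the exact condition, and the platform values are assumptions for illustration; the real manifest parser supports a richer expression language, and this is not the change made in D22363.

```python
import configparser
from typing import Dict

# Hypothetical manifest entry, written in the .ini style described above.
MANIFEST_TEXT = """
[test_example.js]
skip-if = os == "win" && processor == "aarch64"
"""


def should_skip(condition: str, info: Dict[str, str]) -> bool:
    """Evaluate a condition made only of `key == "value"` clauses joined by &&."""
    for clause in condition.split("&&"):
        key, _, value = clause.partition("==")
        if info.get(key.strip()) != value.strip().strip('"'):
            return False
    return True


def skipped_tests(manifest_text: str, info: Dict[str, str]) -> list:
    """Return the names of tests whose skip-if condition matches the platform."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(manifest_text)
    return [name for name in parser.sections()
            if parser.has_option(name, "skip-if")
            and should_skip(parser.get(name, "skip-if"), info)]


if __name__ == "__main__":
    print(skipped_tests(MANIFEST_TEXT, {"os": "win", "processor": "aarch64"}))
```

On any other platform description the same entry yields an empty list, so the test keeps running there.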

Example 3

Another example of manipulating the manifest of a category of tests, this time with web-platform-tests.

For bug 1533912, the manifest was modified to disable the test if it was running on aarch64 hardware.

See change: https://phabricator.services.mozilla.com/D23003

Note that web-platform-tests use a slightly different manifest format.

Example 4

Certain test cases in reftest/crashtest/jsreftest had unexpected outcomes on windows10-aarch64.

For bug 1536365 and bug 1536363, the requirement was to adjust the allowed pixel-difference values so that the tests pass.

See change: https://phabricator.services.mozilla.com/D25113
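
The notion of an allowed pixel difference can be sketched as a pair of ranges: one for the largest per-channel difference and one for the number of differing pixels. The function below is a simplified model with hypothetical names and values; it is not the reftest harness code and does not reproduce the values chosen in D25113.

```python
from typing import Tuple


def within_fuzz(max_channel_diff: int, differing_pixels: int,
                allowed_diff: Tuple[int, int],
                allowed_pixels: Tuple[int, int]) -> bool:
    """Return True when both measurements fall inside their allowed ranges."""
    diff_ok = allowed_diff[0] <= max_channel_diff <= allowed_diff[1]
    pixels_ok = allowed_pixels[0] <= differing_pixels <= allowed_pixels[1]
    return diff_ok and pixels_ok


if __name__ == "__main__":
    # Hypothetical numbers: a comparison differing by 2 in 40 pixels.
    print(within_fuzz(2, 40, allowed_diff=(0, 1), allowed_pixels=(0, 50)))
    print(within_fuzz(2, 40, allowed_diff=(0, 2), allowed_pixels=(0, 50)))
```

Widening a range only for the affected platform lets the comparison stay strict everywhere else.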

Bugs

These are the top-level tracking bugs; the recommended view is the dependency tree for bug 1522997 (login required).

The CI-A team will make efforts to re-test disabled tests on a semi-regular basis, or whenever fixes are committed to components that had tests disabled.

ID | Summary | Priority | Status
1520867 | Investigate running tests on Windows / arm64 | P1 | RESOLVED
1523722 | Run gtest using generic-worker on Windows/aarch64 | P3 | RESOLVED
1524114 | Run xpcshell-test using generic-worker on Windows/aarch64 | P3 | RESOLVED
1524400 | Run mochitest using generic-worker on windows/aarch64 | P3 | RESOLVED
1524410 | Run reftest suites using generic-worker on windows/aarch64 | P3 | RESOLVED
1525118 | [meta] Run taskcluster task from mach try on Bitbar | -- | RESOLVED
1525434 | Run web-platform-test suite using generic-worker on windows/aarch64 | P3 | RESOLVED
1526015 | Run cppunit, jittest, marionette using generic-worker on Windows/aarch64 | P3 | RESOLVED
1527177 | Intermittent [taskcluster:error] [mounts] reading file in zip archive: file already exists: Z:\task_1549919043\mozharness\LICENSE | P5 | RESOLVED
1527469 | Enable windows10-aarch64 build and tests on try server | -- | RESOLVED
1530737 | unable to run talos/raptor on win/aarch64 builds in CI | -- | RESOLVED
1531876 | run talos/raptor tests on windows10 aarch64 laptops | P1 | RESOLVED
1531878 | [taskcluster:error] [mounts] reading file in zip archive: file already exists: C:\tasks\task_1551392763\mozharness\LICENSE | P1 | RESOLVED
1531927 | [meta] windows/aarch64 - skipped/disabled media tests | P5 | RESOLVED
1533114 | [meta] windows/aarch64 - skipped/disabled a11y tests | P5 | NEW
1533880 | [meta] windows/aarch64 - skipped/disabled web-platform-tests | P5 | NEW
1534823 | [meta] windows/aarch64 - skipped/disabled mochitest tests | P5 | NEW
1535467 | windows/aarch64 - test screenshots sometimes show "Windows Defender Firewall has blocked some features of this app" | P3 | NEW
1536208 | [meta] windows/aarch64 - skipped/disabled xpcshell tests | P5 | RESOLVED
1536283 | [meta] windows/aarch64 - skipped/disabled marionette tests | P5 | RESOLVED
1536354 | [meta] windows/aarch64 - skipped/disabled reftests | P5 | NEW
1538785 | windows/aarch64 - plugin tests failing on windows10-aarch64 | -- | RESOLVED
1539693 | windows/aarch64 - re-enable/adjust web-platform-tests results based on new timeout multiplier | -- | RESOLVED
1540213 | windows/aarch64 - enable tests for windows10-aarch64 on taskgraph | -- | RESOLVED
1543521 | windows/aarch64 - lower windows10-aarch64 to tier 2 on try | -- | RESOLVED
1545810 | windows/aarch64 - web platform test chunk investigation | -- | RESOLVED
1546532 | windows/aarch64 - enable mochitest-a11y | -- | RESOLVED
1546728 | windows/aarch64 - enable cppunit | -- | RESOLVED
1546732 | windows/aarch64 - enable jittest | -- | RESOLVED
1547820 | windows/aarch64 - testing/web-platform/tests/media-source crashes on ARM64 | -- | RESOLVED
1552051 | windows/aarch64 - run SM(p) instead of jittest | P2 | RESOLVED
1572185 | Re-enable CSS web-platorm-tests for windows10-aarch64 | -- | RESOLVED

32 Total; 5 Open (15.63%); 27 Resolved (84.38%); 0 Verified (0%);