CI Automation/windows10 aarch64: Difference between revisions

From MozillaWiki
(how to run tests)
 
(35 intermediate revisions by 4 users not shown)
Line 1: Line 1:
=Overview=
=Overview=
Since mid-January 2019 the [https://wiki.mozilla.org/CI_Automation CI-A team] has been working to enable existing test harnesses, continuous integration tests and other tools to run on Windows 10 ARM64, aka aarch64.


With the announcement of the Qualcomm-Mozilla partnership came the need to release a version of Firefox for the ARM64 architecture.
= General Information =
 
Since mid-January 2019 the CI-A team has been working to enable existing test harnesses, continuous integration tests and other tools to run on Windows 10 ARM64.
 
=Information=


== Hardware ==
== Hardware ==
Line 18: Line 15:
== Hosting ==
== Hosting ==


Currently an array of 9 machines are hosted at [https://bitbar.com/ Bitbar] in the United States.
Currently an array of ~30 machines are hosted at [https://bitbar.com/ Bitbar] in the United States.
 
= Setup - local environment =
Developers wishing to run tests locally have two methods.
 
== Prerequisites ==
 
# download and install [https://ftp.mozilla.org/pub/mozilla/libraries/win32/MozillaBuildSetup-2.2.0.exe Mozilla-Build 2.2.0]
 
=== Using mozilla-build ===
 
This method uses a script to download test archives in order to run tests locally.
 
# download <code>script for running mozharness on Yoga</code> from [https://bugzilla.mozilla.org/show_bug.cgi?id=1520867 bug 1520867]
# place the test runner script in the <code>C:\mozilla-build</code> directory
# from treeherder, identify a changeset that contains a successful <code>build-win64-aarch64/opt</code>
# copy the task ID of the build
# invoke start-shell.bat, which will launch a bash-like commandline
# from mozilla-build directory, run the test runner script as follows:
<code>bash script.sh task_id test_type <chunk_to_run> <total_chunks></code>
 
Example:
<code>bash script.sh Q-CE8DFvSAWmc08vw6bd6A xpcshell 1 8</code>
 
=== Using mozilla-central ===
 
This method is taken from [https://www.gijsk.com/blog/2019/02/getting-firefox-artifact-builds-working-on-an-arm64-aarch64-windows-device/ this guide] and uses mozilla-central with a build artifact.
 
# invoke start-shell.bat, which will launch a bash-like commandline
# clone the repository using <code>hg clone https://hg.mozilla.org/mozilla-central/</code>
# run <code>./mach bootstrap</code> and pick artifact build
# download python3 embeddable zip, then extract to <code>mozilla-build/</code> directory
# remove [https://searchfox.org/mozilla-central/rev/152993fa346c8fd9296e4cd6622234a664f53341/python/mozboot/mozboot/bootstrap.py#444 this line]
# download 32bit NodeJS zip and extract to <code>.mozbuild/node</code>
# inside mozilla-build, remove the directory named <code>watchman</code>
# rerun <code>./mach bootstrap</code>
# run <code>./mach build</code>
 
After the artifact build succeeds, it is possible to run most suites of tests as normal:
<code>./mach mochitest <test_file></code>
 
= CI environment =
 
Tests that run in the Taskcluster environment against windows10-aarch64 execute using [https://github.com/taskcluster/generic-worker Taskcluster Generic-Worker]. It is installed as a service via [https://github.com/mozilla-releng/OpenCloudConfig OpenCloudConfig].


= Setup =
== Using OpenCloudConfig ==


Tests that are run against windows10-aarch64 execute using [https://github.com/taskcluster/generic-worker Taskcluster Generic-Worker]. These are installed as a service on the Windows 10 ARM64 manually or via [https://github.com/mozilla-releng/OpenCloudConfig OpenCloudConfig].
This is the method used in production.


A brief walkthrough of the steps to have Taskcluster Generic-Worker running on Windows 10 ARM64 will be provided.
Steps originally taken from [https://bugzilla.mozilla.org/show_bug.cgi?id=1520432#c2 1520432].


== Using only generic-worker ==
$gitBranchOrRef = 'master'
Invoke-Expression (New-Object Net.WebClient).DownloadString(('https://raw.githubusercontent.com/mozilla-releng/OpenCloudConfig/{0}/userdata/rundsc.ps1?{1}' -f $gitBranchOrRef, [Guid]::NewGuid()))


Follow this step to install Taskcluster Generic-Worker on the hardware, and have it launch as a service. After following these steps, the hardware should be ready to accept any tasks started on Taskcluster.  
== Manually install Generic-Worker [Not recommended] ==
 
Follow these steps to install Taskcluster Generic-Worker on the hardware, and have it launch as a service.


Instruction originally from [https://bugzilla.mozilla.org/show_bug.cgi?id=1522997#c2 1522997].
Instruction originally from [https://bugzilla.mozilla.org/show_bug.cgi?id=1522997#c2 1522997].


=== Prerequisites ===
'''Prerequisites'''
* disable Windows S mode
* disable Windows S mode
* disable User Account Control
* disable User Account Control
Line 42: Line 85:
* request scope `assume:project:taskcluster:generic-worker-tester`  
* request scope `assume:project:taskcluster:generic-worker-tester`  


=== Steps ===
'''Steps'''
 
# download the current 386 release of `generic-worker-windows-386.exe` from [https://github.com/taskcluster/generic-worker/releases taskcluster generic-worker].
# download the current 386 release of `generic-worker-windows-386.exe` from [https://github.com/taskcluster/generic-worker/releases taskcluster generic-worker].
# download the latest 386 version of livelog.exe and taskcluster-proxy.exe.
# download the latest 386 version of livelog.exe and taskcluster-proxy.exe.
Line 57: Line 99:
  "ed25519SigningKeyLocation":  "<file location you wrote ed25519 private key in step 6>",
  "ed25519SigningKeyLocation":  "<file location you wrote ed25519 private key in step 6>",
  "livelogSecret":              "<any text>",
  "livelogSecret":              "<any text>",
"openpgpSigningKeyLocation":  "<file location you wrote gpg private key kn step 6>",
  "provisionerId":              "test-provisioner",
  "provisionerId":              "test-provisioner",
  "publicIP":                  "<ideally an IP address of one of your network interfaces>",
  "publicIP":                  "<ideally an IP address of one of your network interfaces>",
Line 71: Line 112:
# sc query "Generic Worker"
# sc query "Generic Worker"


== Using OpenCloudConfig ==
= Currently running on CI =
 
Currently, a limited subset of tests is running regularly on <code>mozilla-central</code> and <code>try</code>. This is to reduce the load on the windows10-aarch64 hardware, which is limited in number.
 
= Run on try =
 
This is probably what you came to the document for. How to run tests against the windows10-aarch64 hardware currently available.
 
'''Hardware is limited so please exercise caution when scheduling tests! A careless try will block many others. Only schedule jobs that are absolutely necessary.'''
 
== Prerequisites ==
 
* try access (commit access level 1)
* up-to-date mozilla-central codebase
 
== Steps ==
 
Note that on <code>try</code>, windows10-aarch64 is hidden by default; please use <code>./mach try fuzzy --full</code> to schedule jobs.
 
# <code>./mach try fuzzy --full</code>
# select tests that need to be run (e.g. 'windows10-aarch64 xpcshell')
# enter
 
Tests will appear in Treeherder under the heading ''Windows 10 AArch64 opt''.
 
= Greening tests =
 
Since Windows on ARM64 is a new platform/architecture combination, failures unique to this combination are to be expected. It will be necessary to fix, correct or update the tests in order to obtain a green run.
 
== Example 1 ==


This is the method that is used in production.
As part of [https://bugzilla.mozilla.org/show_bug.cgi?id=1525743 1525743], the timeout for mochitest-browser-chrome was extended to 4x the default value if the platform combination of Windows and ARM64 is detected.


Steps originally taken from [https://bugzilla.mozilla.org/show_bug.cgi?id=1520432#c2 1520432].
See change: https://phabricator.services.mozilla.com/D19882


# Invoke-Expression (New-Object Net.WebClient).DownloadString('https://raw.githubusercontent.com/mozilla-releng/OpenCloudConfig/aarch64/userdata/rundsc.ps1')
This change greened the test that was previously failing due to a timeout.


= Currently Running =
== Example 2 ==


Currently supported list of tests include:
Some tests provide a manifest file in the form of <test_category>.ini, such as ''mochitest.ini''.


* awsy
For [https://bugzilla.mozilla.org/show_bug.cgi?id=1525665 bug 1525665] it was determined to disable a certain a11y test while windows10-aarch64 a11y support was being investigated.
* mochitest (all flavors, including e10s)
* web-platform-tests (all flavors)
* reftests


Supported, requires non-artifact build:
See change: https://phabricator.services.mozilla.com/D22363


* jittest
This change meant the failing test is now disabled for windows10-aarch64, and the test would have been green had it not been for another failure elsewhere.
* gtests
* cppunittest


For an up-to-date list of tests, please refer to [https://searchfox.org/mozilla-central/source/taskcluster/ci/test/test-platforms.yml#222 this file].
== Example 3 ==


= Run tests =
Another example of manipulating the manifest of a category of tests, this time with ''web-platform-tests''.


== Overview ==
For [https://bugzilla.mozilla.org/show_bug.cgi?id=1533912 bug 1533912], the manifest was modified to disable the test if it was running on aarch64 hardware.


This is probably what you came to the document for. How to run tests against the windows10-aarch64 hardware we currently have running.
See change: https://phabricator.services.mozilla.com/D23003


Follow these steps to be able to enable windows10-aarch64 tests for the try server. These steps are required as of 2019-02-25; it will become obsolete when windows10-aarch64 tests are released to the general public.
Note that web-platform-tests use a slightly different format in order.


=== Prerequisites ===
== Example 4 ==


* try access
Certain test cases in reftest/crashtest/jsreftest had unexpected outcomes on windows10-aarch64.
* up-to-date mozilla-central codebase


=== Steps ===
For [https://bugzilla.mozilla.org/show_bug.cgi?id=1536365 bug 1536365] and [https://bugzilla.mozilla.org/show_bug.cgi?id=1536363 bug 1536363], the requirement was to adjust the pixel-difference values such that tests will pass.


# open the file at taskcluster/ci/test/test-platforms.yml
See change: https://phabricator.services.mozilla.com/D25113
# search for 'windows10-aarch64/opt'
# uncomment all or some of the items under 'test-sets'
# make changes to the local codebase that needs testing
# ./mach try fuzzy
# select tests that need to be run
# enter


= Bugs =
= Bugs =


These are the top-level tracking bugs; the recommended view is [https://bugzilla.mozilla.org/showdependencytree.cgi?id=1522997&hide_resolved=0 tree] (login required).
These are the top-level tracking bugs; the recommended view is [https://bugzilla.mozilla.org/showdependencytree.cgi?id=1522997&hide_resolved=0 tree] (login required).
CI-A team will make efforts to re-test disabled tests on a semi-regular basis, or whenever fixes are committed to components that had tests disabled.


<bugzilla>
<bugzilla>

Latest revision as of 21:35, 3 April 2020

Overview

Since mid-January 2019 the CI-A team has been working to enable existing test harnesses, continuous integration tests and other tools to run on Windows 10 ARM64, aka aarch64.

General Information

Hardware

  • Make: Lenovo
  • Model: C630 YOGA
  • Processor: Qualcomm Snapdragon 850 3.0GHz
  • Cores: 8
  • Memory: 8GB
  • Disk: 128GB SSD

Hosting

Currently an array of ~30 machines is hosted at Bitbar in the United States.

Setup - local environment

Developers wishing to run tests locally have two methods.

Prerequisites

  1. download and install Mozilla-Build 2.2.0

Using mozilla-build

This method uses a script to download test archives in order to run tests locally.

  1. download script for running mozharness on Yoga from bug 1520867
  2. place the test runner script in the C:\mozilla-build directory
  3. from treeherder, identify a changeset that contains a successful build-win64-aarch64/opt
  4. copy the task ID of the build
  5. invoke start-shell.bat, which will launch a bash-like commandline
  6. from mozilla-build directory, run the test runner script as follows:

bash script.sh task_id test_type <chunk_to_run> <total_chunks>

Example: bash script.sh Q-CE8DFvSAWmc08vw6bd6A xpcshell 1 8
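
The invocation above can be wrapped in a small helper that checks the chunk arguments before handing off to the script. This is only a minimal sketch and not part of the original instructions: `build_runner_command` is a hypothetical name, and `script.sh` stands for whatever file name the script from bug 1520867 was saved under.

```python
def build_runner_command(task_id, test_type, chunk, total_chunks, script="script.sh"):
    """Return the argument list for the test runner script (hypothetical helper)."""
    if not task_id:
        raise ValueError("task_id must be the task ID of a build-win64-aarch64/opt build")
    if not 1 <= chunk <= total_chunks:
        raise ValueError("chunk must be between 1 and total_chunks")
    # Argument order mirrors: bash script.sh task_id test_type <chunk_to_run> <total_chunks>
    return ["bash", script, task_id, test_type, str(chunk), str(total_chunks)]


if __name__ == "__main__":
    # Same values as the example above.
    print(" ".join(build_runner_command("Q-CE8DFvSAWmc08vw6bd6A", "xpcshell", 1, 8)))
```

The resulting list can be passed to `subprocess.run` from inside the mozilla-build shell; whether the script accepts other test types than the one shown is not covered here.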

Using mozilla-central

This method is taken from this guide and uses mozilla-central with a build artifact.

  1. invoke start-shell.bat, which will launch a bash-like commandline
  2. clone the repository using hg clone https://hg.mozilla.org/mozilla-central/
  3. run ./mach bootstrap and pick artifact build
  4. download python3 embeddable zip, then extract to mozilla-build/ directory
  5. remove this line
  6. download 32bit NodeJS zip and extract to .mozbuild/node
  7. inside mozilla-build, remove the directory named watchman
  8. rerun ./mach bootstrap
  9. run ./mach build

After the artifact build succeeds, it is possible to run most suites of tests as normal: ./mach mochitest <test_file>

CI environment

Tests that run in the Taskcluster environment against windows10-aarch64 execute using Taskcluster Generic-Worker. It is installed as a service via OpenCloudConfig.

Using OpenCloudConfig

This is the method used in production.

Steps originally taken from 1520432.

$gitBranchOrRef = 'master'
Invoke-Expression (New-Object Net.WebClient).DownloadString(('https://raw.githubusercontent.com/mozilla-releng/OpenCloudConfig/{0}/userdata/rundsc.ps1?{1}' -f $gitBranchOrRef, [Guid]::NewGuid()))
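
To make the PowerShell format expression easier to read, the sketch below rebuilds the same URL in Python. It only illustrates how the two placeholders are filled; it does not download or run anything. The function name `rundsc_url` is hypothetical, and the reading that the random GUID serves to avoid a cached copy of the script is an assumption, not something the original steps state.

```python
import uuid

# Same template as the PowerShell one-liner: {0} is the branch or ref, {1} a fresh GUID.
RUNDSC_TEMPLATE = (
    "https://raw.githubusercontent.com/mozilla-releng/OpenCloudConfig/"
    "{0}/userdata/rundsc.ps1?{1}"
)


def rundsc_url(git_branch_or_ref="master"):
    """Return the rundsc.ps1 URL for the given branch or ref (hypothetical helper)."""
    # A new GUID in the query string makes every URL unique (assumed purpose: cache busting).
    return RUNDSC_TEMPLATE.format(git_branch_or_ref, uuid.uuid4())


if __name__ == "__main__":
    print(rundsc_url())
```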

Manually install Generic-Worker [Not recommended]

Follow these steps to install Taskcluster Generic-Worker on the hardware, and have it launch as a service.

Instruction originally from 1522997.

Prerequisites

  • disable Windows S mode
  • disable User Account Control
  • disable Windows Firewall
  • download NSSM to C:\nssm-2.24\
  • create "Remote Desktop Users" group:
net localgroup "Remote Desktop Users" /add
  • log in to Taskcluster
  • request scope `assume:project:taskcluster:generic-worker-tester`

Steps

  1. download the current 386 release of `generic-worker-windows-386.exe` from taskcluster generic-worker.
  2. download the latest 386 version of livelog.exe and taskcluster-proxy.exe.
  3. create new directory C:\generic-worker.
  4. move the three executable files under C:\generic-worker.
  5. rename generic-worker-windows-386.exe to generic-worker.exe.
  6. generate two signing keys:
generic-worker new-openpgp-keypair --file <unique_file_name>
generic-worker new-ed25519-keypair --file <unique_file_name>
  7. create generic-worker.config and include the following JSON object:
{
  "accessToken":                "<access token tied to taskcluster>",
  "clientId":                   "<client ID tied to taskcluster>",
  "ed25519SigningKeyLocation":  "<file location you wrote ed25519 private key in step 6>",
  "livelogSecret":              "<any text>",
  "provisionerId":              "test-provisioner",
  "publicIP":                   "<ideally an IP address of one of your network interfaces>",
  "rootURL":                    "https://taskcluster.net",
  "workerGroup":                "test-worker-group",
  "workerId":                   "test-worker-id",
  "workerType":                 "<a unique string that only you will use for your test worker(s)>"
}
  8. launch cmd.exe with Administrator rights.
  9. cd c:\generic-worker
  10. generic-worker.exe install service --config generic-worker.config --nssm c:\nssm-2.24\win32\nssm.exe
  11. reboot once installed.
  12. launch cmd.exe with Administrator rights.
  13. sc query "Generic Worker"
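
Writing generic-worker.config by hand is error-prone, so the following sketch renders it from a dictionary and refuses to continue if a value is missing. It assumes the file is a JSON object containing exactly the keys listed in the steps above; other generic-worker releases may expect a different set. `render_config` and `REQUIRED_KEYS` are hypothetical names, and the example values are placeholders.

```python
import json

# Keys taken from the config listing in the steps above.
REQUIRED_KEYS = [
    "accessToken",
    "clientId",
    "ed25519SigningKeyLocation",
    "livelogSecret",
    "provisionerId",
    "publicIP",
    "rootURL",
    "workerGroup",
    "workerId",
    "workerType",
]


def render_config(values):
    """Return the config as a JSON string, or raise if a required value is missing."""
    missing = [key for key in REQUIRED_KEYS if not values.get(key)]
    if missing:
        raise ValueError("missing config values: " + ", ".join(missing))
    return json.dumps({key: values[key] for key in REQUIRED_KEYS}, indent=2)


if __name__ == "__main__":
    example = {key: "<fill me in>" for key in REQUIRED_KEYS}
    example.update(
        provisionerId="test-provisioner",
        rootURL="https://taskcluster.net",
        workerGroup="test-worker-group",
        workerId="test-worker-id",
    )
    print(render_config(example))
```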

Currently running on CI

Currently, a limited subset of tests is running regularly on mozilla-central and try. This is to reduce the load on the windows10-aarch64 hardware, which is limited in number.

Run on try

This is probably what you came to this document for: how to run tests against the windows10-aarch64 hardware currently available.

Hardware is limited so please exercise caution when scheduling tests! A careless try will block many others. Only schedule jobs that are absolutely necessary.

Prerequisites

  • try access (commit access level 1)
  • up-to-date mozilla-central codebase

Steps

Note that on try, windows10-aarch64 is hidden by default; please use ./mach try fuzzy --full to schedule jobs.

  1. ./mach try fuzzy --full
  2. select tests that need to be run (e.g. 'windows10-aarch64 xpcshell')
  3. enter

Tests will appear in Treeherder under the heading Windows 10 AArch64 opt.
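
For repeated pushes it can help to pre-fill the selection instead of typing it in the interactive picker. The sketch below only assembles the command line. It assumes that `./mach try fuzzy` accepts a `--query` option in your checkout; please confirm with `./mach try fuzzy --help` before relying on it. `try_fuzzy_command` is a hypothetical name.

```python
import shlex


def try_fuzzy_command(query=None):
    """Return the argument list for scheduling jobs from the full task set."""
    # --full is needed because windows10-aarch64 is hidden by default on try.
    argv = ["./mach", "try", "fuzzy", "--full"]
    if query:
        # Assumed option: pre-fills the fuzzy selection with the given terms.
        argv += ["--query", query]
    return argv


if __name__ == "__main__":
    print(shlex.join(try_fuzzy_command("windows10-aarch64 xpcshell")))
```

Keeping the query narrow, as in the example, is in line with the request above to schedule only the jobs that are absolutely necessary.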

Greening tests

Since Windows on ARM64 is a new platform/architecture combination, failures unique to this combination are to be expected. It will be necessary to fix, correct or update the tests in order to obtain a green run.

Example 1

As part of 1525743, the timeout for mochitest-browser-chrome was extended to 4x the default value if the platform combination of Windows and ARM64 is detected.

See change: https://phabricator.services.mozilla.com/D19882

This change greened the test that was previously failing due to a timeout.
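
The idea behind this kind of fix can be sketched as follows. This is an illustration only, not the code from D19882: the function name, the `os_name`/`processor` parameters and their string values are assumptions chosen for the example.

```python
# Factor described above: 4x the default on Windows + ARM64.
WINDOWS_ARM64_TIMEOUT_FACTOR = 4


def effective_timeout(default_timeout, os_name, processor):
    """Return the timeout to use, extended on the Windows/ARM64 combination."""
    if os_name == "win" and processor == "aarch64":
        return default_timeout * WINDOWS_ARM64_TIMEOUT_FACTOR
    # Every other platform keeps the default value.
    return default_timeout


if __name__ == "__main__":
    print(effective_timeout(100, "win", "aarch64"))
    print(effective_timeout(100, "win", "x86_64"))
```

Scoping the multiplier to the exact platform combination keeps timeouts on other platforms unchanged, so genuinely hanging tests there are still caught at the default limit.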

Example 2

Some tests provide a manifest file in the form of <test_category>.ini, such as mochitest.ini.

For bug 1525665 it was determined to disable a certain a11y test while windows10-aarch64 a11y support was being investigated.

See change: https://phabricator.services.mozilla.com/D22363

With this change the failing test is disabled for windows10-aarch64; the run would have been green had it not been for another failure elsewhere.
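
Conceptually, a manifest entry carries a condition, and the harness skips the test when the condition matches the machine it is running on. The sketch below models that with plain dictionaries. It is not the manifest syntax itself (see D22363 for the real change), and the `os` and `processor` keys with their values are assumptions made for the example.

```python
# Hypothetical description of the platform a test should be disabled on.
WIN_AARCH64 = {"os": "win", "processor": "aarch64"}


def should_skip(skip_conditions, platform_info):
    """Return True if any condition matches the platform info on all of its keys."""
    return any(
        all(platform_info.get(key) == value for key, value in condition.items())
        for condition in skip_conditions
    )


if __name__ == "__main__":
    yoga = {"os": "win", "processor": "aarch64", "bits": 64}
    desktop = {"os": "win", "processor": "x86_64", "bits": 64}
    print(should_skip([WIN_AARCH64], yoga))
    print(should_skip([WIN_AARCH64], desktop))
```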

Example 3

Another example of manipulating the manifest of a category of tests, this time with web-platform-tests.

For bug 1533912, the manifest was modified to disable the test if it was running on aarch64 hardware.

See change: https://phabricator.services.mozilla.com/D23003

Note that web-platform-tests use a slightly different manifest format.

Example 4

Certain test cases in reftest/crashtest/jsreftest had unexpected outcomes on windows10-aarch64.

For bug 1536365 and bug 1536363, the requirement was to adjust the pixel-difference values so that the tests pass.

See change: https://phabricator.services.mozilla.com/D25113
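
To illustrate what adjusting pixel-difference values means, the sketch below compares two images with a tolerance. It assumes the tolerance consists of two numbers, a maximum per-channel difference and a maximum count of differing pixels; the actual annotations and values are in D25113. `fuzzy_match` is a hypothetical name, and images are modelled as flat lists of (r, g, b) tuples.

```python
def fuzzy_match(pixels_a, pixels_b, max_difference, max_pixels):
    """Return True if the two images differ by no more than the allowed tolerance."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images must have the same number of pixels")
    worst = 0      # largest per-channel difference seen so far
    differing = 0  # number of pixels that differ at all
    for a, b in zip(pixels_a, pixels_b):
        diff = max(abs(x - y) for x, y in zip(a, b))
        if diff:
            differing += 1
            worst = max(worst, diff)
    return worst <= max_difference and differing <= max_pixels


if __name__ == "__main__":
    reference = [(0, 0, 0), (10, 10, 10)]
    rendered = [(0, 0, 1), (10, 10, 10)]
    print(fuzzy_match(reference, rendered, 0, 0))  # exact comparison
    print(fuzzy_match(reference, rendered, 1, 1))  # small tolerance
```

Widening the tolerance only for the affected platform keeps the comparison strict elsewhere.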

Bugs

These are the top-level tracking bugs; the recommended view is tree (login required).

CI-A team will make efforts to re-test disabled tests on a semi-regular basis, or whenever fixes are committed to components that had tests disabled.

Full Query
ID | Summary | Priority | Status
1520867 | Investigate running tests on Windows / arm64 | P1 | RESOLVED
1523722 | Run gtest using generic-worker on Windows/aarch64 | P3 | RESOLVED
1524114 | Run xpcshell-test using generic-worker on Windows/aarch64 | P3 | RESOLVED
1524400 | Run mochitest using generic-worker on windows/aarch64 | P3 | RESOLVED
1524410 | Run reftest suites using generic-worker on windows/aarch64 | P3 | RESOLVED
1525118 | [meta] Run taskcluster task from mach try on Bitbar | -- | RESOLVED
1525434 | Run web-platform-test suite using generic-worker on windows/aarch64 | P3 | RESOLVED
1526015 | Run cppunit, jittest, marionette using generic-worker on Windows/aarch64 | P3 | RESOLVED
1527177 | Intermittent [taskcluster:error] [mounts] reading file in zip archive: file already exists: Z:\task_1549919043\mozharness\LICENSE | P5 | RESOLVED
1527469 | Enable windows10-aarch64 build and tests on try server | -- | RESOLVED
1530737 | unable to run talos/raptor on win/aarch64 builds in CI | -- | RESOLVED
1531876 | run talos/raptor tests on windows10 aarch64 laptops | P1 | RESOLVED
1531878 | [taskcluster:error] [mounts] reading file in zip archive: file already exists: C:\tasks\task_1551392763\mozharness\LICENSE | P1 | RESOLVED
1531927 | [meta] windows/aarch64 - skipped/disabled media tests | P5 | RESOLVED
1533114 | [meta] windows/aarch64 - skipped/disabled a11y tests | P5 | NEW
1533880 | [meta] windows/aarch64 - skipped/disabled web-platform-tests | P5 | NEW
1534823 | [meta] windows/aarch64 - skipped/disabled mochitest tests | P5 | NEW
1535467 | windows/aarch64 - test screenshots sometimes show "Windows Defender Firewall has blocked some features of this app" | P3 | NEW
1536208 | [meta] windows/aarch64 - skipped/disabled xpcshell tests | P5 | RESOLVED
1536283 | [meta] windows/aarch64 - skipped/disabled marionette tests | P5 | RESOLVED
1536354 | [meta] windows/aarch64 - skipped/disabled reftests | P5 | NEW
1538785 | windows/aarch64 - plugin tests failing on windows10-aarch64 | -- | RESOLVED
1539693 | windows/aarch64 - re-enable/adjust web-platform-tests results based on new timeout multiplier | -- | RESOLVED
1540213 | windows/aarch64 - enable tests for windows10-aarch64 on taskgraph | -- | RESOLVED
1543521 | windows/aarch64 - lower windows10-aarch64 to tier 2 on try | -- | RESOLVED
1545810 | windows/aarch64 - web platform test chunk investigation | -- | RESOLVED
1546532 | windows/aarch64 - enable mochitest-a11y | -- | RESOLVED
1546728 | windows/aarch64 - enable cppunit | -- | RESOLVED
1546732 | windows/aarch64 - enable jittest | -- | RESOLVED
1547820 | windows/aarch64 - testing/web-platform/tests/media-source crashes on ARM64 | -- | RESOLVED
1552051 | windows/aarch64 - run SM(p) instead of jittest | P2 | RESOLVED
1572185 | Re-enable CSS web-platorm-tests for windows10-aarch64 | -- | RESOLVED

32 Total; 5 Open (15.63%); 27 Resolved (84.38%); 0 Verified (0%);