ReleaseEngineering/How To/Self Provision a TaskCluster Windows Instance

Revision as of 15:28, 1 June 2018 by Pmoore (talk | contribs) (Added info that loan only lasts 12 hours.)


For generic-worker 10.5.0 onwards

Integration with taskcluster one-click loaner workflow will be done in bug 1368961.

Using this method, you can get an interactive RDP session to a live worker for 12 hours. After 12 hours, the worker will be killed. Please note, AWS may issue a Spot Termination at any time, so if you are very unlucky, the worker may die before the 12 hours are over.

1) Find the task you want to play with in Treeherder, and follow the link to the Taskcluster Task Inspector.
2) Go to Actions -> Edit Task.
3) Add rdpInfo to the payload section:

payload:
  rdpInfo: 'login-identity/<login-identity>/rdpinfo.txt'

For example:

payload:
  rdpInfo: 'login-identity/mozilla-auth0/ad|Mozilla-LDAP|pmoore/rdpinfo.txt'

(Check https://tools.taskcluster.net/credentials to see what your login identity is; you should have the scope assume:login-identity:<login-identity>.)

4) Check which workerType the task uses from the task definition, and then add the following scope to the list of task scopes:

scopes:
  - 'generic-worker:allow-rdp:aws-provisioner-v1/<workerType>'

For example:

scopes:
  - 'generic-worker:allow-rdp:aws-provisioner-v1/gecko-t-win7-32'

5) If you will require administrator privileges:

5.1) Ensure you have scope generic-worker:os-group:Administrators in https://tools.taskcluster.net/credentials/

If you do not have the scope, request it using a Service Request or by asking in #taskcluster IRC channel.

5.2) Add the following to the task definition (the scope goes in the task scopes, the osGroups entry in the task payload):

scopes:
  - generic-worker:os-group:Administrators
payload:
  osGroups:
    - Administrators

6) Run the task, and when it starts, go to Run Artifacts and wait for the rdpinfo.txt file to appear. It contains the RDP connection information.
7) Enter the connection information into your RDP client of choice.
8) Connect with screen resolution 1280x1024. This resolution is important for gecko tests, since it is the screen size the tests use, and the screen size cannot be changed once you have made a connection.
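
The edits in steps 3 to 5 above can be sketched as a small Python helper, in case you prefer to prepare the task definition outside the Edit Task form. This is only an illustration: the function name add_rdp_access and the idea of holding the task definition as a Python dict are assumptions, not part of any Taskcluster tooling. The scope and payload strings are the ones given in the steps above.

```python
import copy

# Provisioner used in the scope shown in step 4.
PROVISIONER_ID = "aws-provisioner-v1"


def add_rdp_access(task, login_identity, admin=False):
    """Return a copy of a task definition dict with RDP access requested.

    Hypothetical helper: `task` is assumed to be the task definition
    loaded into a dict that contains at least a 'workerType' key.
    """
    task = copy.deepcopy(task)
    payload = task.setdefault("payload", {})
    scopes = task.setdefault("scopes", [])

    # Step 3: artifact name under which the RDP details are published.
    payload["rdpInfo"] = f"login-identity/{login_identity}/rdpinfo.txt"

    # Step 4: scope allowing RDP for this workerType.
    rdp_scope = f"generic-worker:allow-rdp:{PROVISIONER_ID}/{task['workerType']}"
    if rdp_scope not in scopes:
        scopes.append(rdp_scope)

    # Step 5 (optional): put the task user in the Administrators group.
    if admin:
        admin_scope = "generic-worker:os-group:Administrators"
        if admin_scope not in scopes:
            scopes.append(admin_scope)
        groups = payload.setdefault("osGroups", [])
        if "Administrators" not in groups:
            groups.append("Administrators")

    return task
```

The original dict is left untouched, so you can compare the two before submitting the edited task.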

Performing operations as Administrator

Until bug 1465374 is resolved this can be a little tricky.

1) Make sure you followed step 5 above!
2) Open a regular command shell (e.g. Start Menu -> Run -> cmd.exe)
3) From there check which users are in the Administrators group:

C:\Windows\System32>net localgroup Administrators
Alias name     Administrators
Comment        Administrators have complete and unrestricted access to the computer/domain

Members

-------------------------------------------------------------------------------
Administrator
task_1527672240
The command completed successfully.

4) Check that the user you are logged in as is one of the users listed above:

C:\Windows\System32>whoami
i-015fe55bb8553\task_1527672240

5) Open a new UAC elevated command shell:

C:\Windows\System32>powershell.exe Start-Process cmd.exe -Verb runAs

6) This will require you to enter Administrative credentials. You will be presented with a prompt similar to this:

[Screenshot: dialogue box asking for Administrative credentials]

Click on More choices and select the task user, and copy/paste the task password (Ctrl-V).

You should now have a command shell running as Administrator!
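
If you want to script the checks from steps 3 and 4, the following Python sketch parses the output of net localgroup Administrators and compares it with the output of whoami. It assumes the English-language output format shown above (a line of dashes before the member list and a "The command completed successfully." line after it); the function names are made up for this example.

```python
def administrators_members(net_output):
    """Extract member names from `net localgroup Administrators` output.

    Assumes the English layout shown above: members are listed between
    the dashed separator and the 'The command completed' line.
    """
    members = []
    in_members = False
    for raw_line in net_output.splitlines():
        line = raw_line.strip()
        if line.startswith("---"):
            in_members = True
            continue
        if not in_members or not line:
            continue
        if line.startswith("The command completed"):
            break
        members.append(line)
    return members


def is_administrator(whoami_output, net_output):
    """Return True if the whoami user is listed as an Administrators member."""
    # whoami prints HOST\user; keep only the user part.
    user = whoami_output.strip().rsplit("\\", 1)[-1]
    listed = [m.lower() for m in administrators_members(net_output)]
    return user.lower() in listed
```

On the worker, the two inputs could be captured with subprocess.run(..., capture_output=True, text=True) and passed in as strings.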

Rerunning tasks after they have completed

If you want to rerun a step after the task has completed, there is a script for each command in the task payload. They are named like this:

Z:\task_XXXXXXX\command_000000_wrapper.bat
Z:\task_XXXXXXX\command_000001_wrapper.bat 
Z:\task_XXXXXXX\command_000002_wrapper.bat 
....

Simply run them by hand to reproduce any of the task steps. If you wish to make changes, feel free to edit them. After the loan expires, the task directory, including all your changes, will be deleted. Therefore make sure to keep track of any changes that need to be put back in the gecko build system.
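
To see which wrapper scripts exist and in what order they ran, a short Python sketch like the one below can list them. It relies only on the naming pattern shown above; the function name wrapper_scripts is an assumption for illustration, and the task directory is whatever Z:\task_XXXXXXX is on your loaner.

```python
import re
from pathlib import Path

# Matches the wrapper names shown above, e.g. command_000002_wrapper.bat
WRAPPER_NAME = re.compile(r"^command_(\d{6})_wrapper\.bat$")


def wrapper_scripts(task_dir):
    """Return the command wrapper scripts in task_dir, ordered by command index."""
    found = []
    for path in Path(task_dir).iterdir():
        match = WRAPPER_NAME.match(path.name)
        if match:
            found.append((int(match.group(1)), path))
    # Sort on the numeric index only, then drop it.
    return [path for _, path in sorted(found, key=lambda item: item[0])]
```

Each returned path can then be run by hand, or from Python with subprocess.run(["cmd.exe", "/c", str(path)]).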

For earlier versions of generic-worker (<10.5.0)

How does it work?

TaskCluster Windows instances can be borrowed for debugging at will. There is no need to raise a bug. The process is triggered by running a special task that tells the instance you intend to commandeer it.

When you trigger this task, a recurring scheduled job on the instance will:

  • perform a little cleanup, deleting data and configuration that is only pertinent for automation
  • change the instance credentials, so that only you will have access to the instance and so that the instance credentials can be shared with you
  • place encrypted credentials in the task artifacts for you
  • delete the generic-worker and associated user and configuration

How is it triggered?

Visit https://tools.taskcluster.net/task-creator and create a new task that looks something like below.

 provisionerId: aws-provisioner-v1
 workerType: gecko-t-win7-32
 retries: 0
 created: '2017-05-24T12:05:16.556Z'
 deadline: '2017-05-24T13:05:16.556Z'
 expires: '2018-05-24T09:08:08.128Z'
 scopes: []
 payload:
   maxRunTime: 300
   artifacts:
     - path: ..\loan
       name: public/test_info
       type: directory
       expires: '2018-05-24T08:49:41.238Z'
   env:
     REQUESTER_EMAIL: grenade@mozilla.com
     REQUESTER_PUBLIC_KEY_URL: https://keybase.io/grenade/pgp_keys.asc?fingerprint=1c09ac24c113c7f080dd4aa5b3c5a958508a43f2
   command:
   - >-
     c:\mozilla-build\python\python.exe -c "import os, json;
     json.dump({'requester': { 'email': os.environ['REQUESTER_EMAIL'],
     'publickeyurl': os.environ['REQUESTER_PUBLIC_KEY_URL'], 'taskid':
     os.environ['TASK_ID'], 'taskFolder': os.getcwd()}},
     open(r'z:\loan-request.json', 'wb'))"
   - >-
     c:\mozilla-build\python\python.exe -c "exec(\"import os.path, time\nwhile
     (not os.path.exists(r'z:\loan\credentials.txt.gpg')):\n  time.sleep(1)\"")
 metadata:
   name: Windows Loan Request
   description: Self service windows instance loan request
   owner: grenade@mozilla.com
   source: https://wiki.mozilla.org/index.php?title=ReleaseEngineering/How_To/Self_Provision_a_TaskCluster_Windows_Instance
 tags: {}
 extra: {}
  • Change workerType to one of:
    • gecko-1-b-win2012: Windows Server 2012 r2 - Use for debugging Build issues
    • gecko-t-win10-64: Windows 10 Enterprise, x86_64 - Use for debugging Test issues
    • gecko-t-win10-64-gpu: Windows 10 Enterprise, x86_64 - Use for debugging Test issues where a GPU is required
    • gecko-t-win7-32: Windows 7 Enterprise, x86 - Use for debugging Test issues
    • gecko-t-win7-32-gpu: Windows 7 Enterprise, x86 - Use for debugging Test issues where a GPU is required
  • Change REQUESTER_EMAIL to the email address associated with your GPG public key
  • Change REQUESTER_PUBLIC_KEY_URL to a URL that will return a plain text (raw) copy of your GPG public key.
  • Change owner to your own email address
  • Click the Update Timestamps button at the bottom
  • Click the Create Task button at the bottom
  • When the task completes, navigate to the artifacts tab (select run 0 first) and download the public/test_info/credentials.txt.gpg artifact to your computer.
  • Decrypt it with a command similar to:
 gpg2 --decrypt credentials.txt.gpg
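
The fields you need to change in the task definition above can also be generated with a few lines of Python, which may help if you request loaners often. This is a sketch under assumptions: the function name loan_request_fields is invented, and the deadline (one hour after creation) and expiry (365 days after creation) simply mirror the example task rather than any documented requirement.

```python
from datetime import datetime, timedelta, timezone

# Worker types listed above.
WORKER_TYPES = {
    "gecko-1-b-win2012",
    "gecko-t-win10-64",
    "gecko-t-win10-64-gpu",
    "gecko-t-win7-32",
    "gecko-t-win7-32-gpu",
}


def iso(ts):
    """Format a UTC datetime like the example task: 2017-05-24T12:05:16.556Z"""
    return ts.strftime("%Y-%m-%dT%H:%M:%S.") + f"{ts.microsecond // 1000:03d}Z"


def loan_request_fields(worker_type, key_email, public_key_url, owner=None, now=None):
    """Build the per-request fields of the loan task (hypothetical helper)."""
    if worker_type not in WORKER_TYPES:
        raise ValueError(f"unknown workerType: {worker_type}")
    now = now or datetime.now(timezone.utc)
    return {
        "workerType": worker_type,
        "created": iso(now),
        # Assumed offsets, copied from the shape of the example task.
        "deadline": iso(now + timedelta(hours=1)),
        "expires": iso(now + timedelta(days=365)),
        "env": {
            "REQUESTER_EMAIL": key_email,
            "REQUESTER_PUBLIC_KEY_URL": public_key_url,
        },
        # metadata.owner; defaults to the GPG key email if not given.
        "owner": owner or key_email,
    }
```

The returned values still need to be merged into the full task definition shown above before creating the task.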

How do I connect to my borrowed instance?

Remote Desktop is the only currently supported mechanism. Further, the processes that manage cleanup of abandoned or unused loaner instances check for RDP sessions to determine if the instance is actually in use or if it has been abandoned.

Connecting from Windows

  • GUI: Start > Remote Desktop Connection > enter the IP Address and credentials from the decrypted credentials.txt file and connect
  • Command line:
 mstsc /w:1024 /h:768 /v:<ip_address>

Connecting from Linux

Fedora

 # install xfreerdp (on Fedora the client is shipped in the freerdp package)
 sudo dnf install -y freerdp
 # if you have an American English (en-US) keyboard:
 xfreerdp /u:$username /p:"$password" /kbd:409 /w:1024 /h:768 +clipboard /v:$ip_address
 # if you have a Queen's English (en-GB) keyboard:
 xfreerdp /u:$username /p:"$password" /kbd:809 /w:1024 /h:768 +clipboard /v:$ip_address

For other keyboard layouts, see: https://github.com/FreeRDP/FreeRDP/blob/master/include/freerdp/locale/keyboard.h
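
If your password contains characters that are awkward to quote in a shell, you can build the same command line as an argument list in Python instead. The sketch below uses only the xfreerdp options shown above; the function name xfreerdp_command and the layout table are illustrative assumptions, and the table contains just the two layout IDs from the examples.

```python
# Keyboard layout IDs taken from the two examples above.
KEYBOARD_LAYOUTS = {"en-US": "409", "en-GB": "809"}


def xfreerdp_command(username, password, ip_address, layout="en-US",
                     width=1024, height=768):
    """Build the xfreerdp argument list used in the examples above."""
    return [
        "xfreerdp",
        f"/u:{username}",
        f"/p:{password}",
        f"/kbd:{KEYBOARD_LAYOUTS[layout]}",
        f"/w:{width}",
        f"/h:{height}",
        "+clipboard",
        f"/v:{ip_address}",
    ]
```

Passing the list to subprocess.run() starts the client without going through a shell, so no extra quoting of the password is needed.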

Connecting from Mac OSX

Untested. Please edit this section if you know more.

   open rdp://$username:$password@$ip_address:3389?screendepth###24:screenWidth###1024:screenHeight###768

see also: https://apple.stackexchange.com/a/54925/112073

Microsoft also has a Remote Desktop app for macOS (available on the app store at https://itunes.apple.com/us/app/microsoft-remote-desktop-10/id1295203466?mt=12) that works pretty well and is easy to use.

How do I return it when I'm done?

There's no need. Your loaned instance is an Amazon EC2 spot instance whose lifetime is finite. It can even die while it's in the middle of working for you, if the spot price is outbid. Think of it as a disposable minion whose survival depends on being busy and who will expire when exhausted or bored, or defect to a higher-paying villain at any time. If you don't create an RDP connection to your instance within 30 minutes of requesting it, it will die of boredom. If you disconnect your RDP session for more than 15 minutes, it will die of boredom. If you shut it down, it will die fulfilled. When working on the loaner, save your intermediate results periodically and script your actions, so that if the loaner disappears you can quickly get back to where you were on a new loaner.
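
The two boredom rules can be written down as a small Python model, which may help when planning how long you can step away. This only restates the 30-minute and 15-minute limits described above; it is not the actual cleanup code that runs on the instance, and all names in it are invented for the example.

```python
from datetime import datetime, timedelta

# Limits as described above.
FIRST_CONNECT_LIMIT = timedelta(minutes=30)
DISCONNECT_LIMIT = timedelta(minutes=15)


def loaner_considered_abandoned(now, requested_at, last_disconnected_at=None,
                                connected=False):
    """Model of the described rules: True if the loaner would 'die of boredom'."""
    if connected:
        # An active RDP session keeps the instance alive.
        return False
    if last_disconnected_at is None:
        # Nobody has connected yet: 30 minutes from the loan request.
        return now - requested_at > FIRST_CONNECT_LIMIT
    # Previously connected: 15 minutes from the last disconnect.
    return now - last_disconnected_at > DISCONNECT_LIMIT
```

Spot termination by AWS is independent of these rules and can still end the loan at any time.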

How do I get help if it's not working?

  • raise a bug in the RelOps component. cc or assign rthijssen@mozilla.com
  • ping :grenade in #taskcluster on IRC (during normal working hours: 9 - 5 GMT+3)