ReleaseEngineering/How To/Self Provision a TaskCluster Windows Instance: Difference between revisions

From MozillaWiki
(Added information about rerunning task steps)
(Updated to reflect move to taskcluster)
 
(8 intermediate revisions by 5 users not shown)
Line 1: Line 1:
{{Release Engineering|How To|Self Provision a TaskCluster Windows Instance}}
{{Release Engineering How To|Self Provision a TaskCluster Windows Instance}}


__TOC__
__TOC__
Line 7: Line 7:
'''Integration with taskcluster one-click loaner workflow will be done in [https://bugzil.la/1368961 bug 1368961].'''
'''Integration with taskcluster one-click loaner workflow will be done in [https://bugzil.la/1368961 bug 1368961].'''


1) Find the task you want to play with in treeherder, and follow the link to the Taskcluster Task Inspector.<br />
1) If you are connecting to a hardware worker in our data center (provisionerId: "releng-hardware") such as a MoonShot machine, make sure to
2) Go to ''Actions -> Edit Task''.<br />
announce in #ci and/or create a bug under [https://bugzilla.mozilla.org/enter_bug.cgi?product=Infrastructure%20%26%20Operations&component=CIDuty CiDuty - Bugzilla] announcing the fact that you will take a loaner.
3) Add <code>rdpInfo</code> to the payload section:
 
2) Find the task you want to play with in treeherder, and follow the link to the Taskcluster Task Inspector.<br />
3) Go to ''Actions -> Edit Task''.<br />
4) Add <code>rdpInfo</code> to the payload section:


  payload:
  payload:
Line 19: Line 22:
   rdpInfo: 'login-identity/mozilla-auth0/ad|Mozilla-LDAP|pmoore/rdpinfo.txt'
   rdpInfo: 'login-identity/mozilla-auth0/ad|Mozilla-LDAP|pmoore/rdpinfo.txt'


(check https://tools.taskcluster.net/credentials to see what your login identity is, e.g. you should have the scope <code>assume:login-identity:<login-identity></code>).
(check https://firefox-ci-tc.services.mozilla.com/profile to see what your login identity is, e.g. you should have the scope <code>assume:login-identity:<login-identity></code>).


4) Check which <code>workerType</code> the task uses from the task definition, and then add the following scope to the list of task scopes:
4) Check which <code>workerType</code> the task uses from the task definition, and then add the following scope to the list of task scopes:


  scopes:
  scopes:
   - 'generic-worker:allow-rdp:aws-provisioner-v1/<workerType>'
   - 'generic-worker:allow-rdp:gecko-t/<workerType>'


For example:
For example:


  scopes:
  scopes:
   - 'generic-worker:allow-rdp:aws-provisioner-v1/gecko-t-win7-32'
   - 'generic-worker:allow-rdp:gecko-t/t-win7-32-gpu'


5) Run the task, and when it starts, go to ''Run Artifacts'' to see the <code>rdpInfo.txt</code> file appear with rdp connection information.<br />
5) If you will require administrator privileges:
6) Enter the connection information into your RDP client of choice.<br />
7) Connect with screen resolution 1280x1024 ! Note, it is '''important''' to use this resolution for gecko tests, since this is the screen size used by the tests, and the screen size cannot change once you have made a connection.


== Rerunning tasks after they have completed ==
5.1) Ensure you have scope <code>generic-worker:os-group:<provisionerId>/<workerType>/Administrators</code> in https://tools.taskcluster.net/credentials/


If you want to rerun a step after the task has completed, there is a script for each command in the task payload. They are named like this:
If you do not have the scope, request it using a [https://bugzilla.mozilla.org/enter_bug.cgi?product=Taskcluster&component=Service%20Request Service Request] or by asking in <code>#taskcluster</code> IRC channel.  


Z:\task_XXXXXXX\command_000000_wrapper.bat
5.2) Add the following to the task payload:
Z:\task_XXXXXXX\command_000001_wrapper.bat
Z:\task_XXXXXXX\command_000002_wrapper.bat
....


Simply run them by hand to reproduce any of the task steps. If you wish to make changes, feel free to edit them. After the loan expires, the task directory, including all your changes, will be deleted. Therefore make sure to keep track of any changes that need to be put back in the gecko build system.
scopes:
  - generic-worker:os-group:<provisionerId>/<workerType>/Administrators
payload:
  osGroups:
    - Administrators


= For earlier versions of generic-worker (<10.5.0) =
6) Make sure that if your payload mounts any artifacts, that the task ID(s) are included as a list in top level property <code>dependencies</code> ('''not''' under <code>payload</code>)


== How does it work? ==
So if it says


TaskCluster Windows instances can be borrowed for debugging at will. There is no need to raise a bug. The process is triggered by running a special task that tells the instance you intend to commandeer it.
  mounts:
    - directory: .
      content:
        taskId: HUqhbXTMTXSXYEM6K2H3wA


When you trigger this task, a recurring scheduled job on the instance will:
Add (at the top level indent)
* perform a little cleanup, deleting data and configuration that is only pertinent for automation
* change the instance credentials, so that only you will have access to the instance and so that the instance credentials can be shared with you
* place encrypted credentials in the task artifacts for you
* delete the generic-worker and associated user and configuration


== How is it triggered? ==
dependencies:
  - HUqhbXTMTXSXYEM6K2H3wA


Visit https://tools.taskcluster.net/task-creator and create a new task that looks something like below.
7) Submit the task. If the task resolves as exception and doesn't run, '''check the task log file'''. It should provide more information about the cause.<br />
8) When the task successfully starts, go to ''Run Artifacts'' to see the <code>rdpInfo.txt</code> file appear with rdp connection information.<br />
9) Enter the connection information into your RDP client of choice.<br />
10) Connect with screen resolution 1280x1024 ! Note, it is '''important''' to use this resolution for gecko tests, since this is the screen size used by the tests, and the screen size cannot change once you have made a connection.


  provisionerId: aws-provisioner-v1
== Performing operations as Administrator ==
  workerType: gecko-t-win7-32
  retries: 0
  created: '2017-05-24T12:05:16.556Z'
  deadline: '2017-05-24T13:05:16.556Z'
  expires: '2018-05-24T09:08:08.128Z'
  scopes: []
  payload:
    maxRunTime: 300
    artifacts:
      - path: ..\loan
        name: public/test_info
        type: directory
        expires: '2018-05-24T08:49:41.238Z'
    env:
      REQUESTER_EMAIL: grenade@mozilla.com
      REQUESTER_PUBLIC_KEY_URL: https://keybase.io/grenade/pgp_keys.asc?fingerprint=1c09ac24c113c7f080dd4aa5b3c5a958508a43f2
    command:
    - >-
      c:\mozilla-build\python\python.exe -c "import os, json;
      json.dump({'requester': { 'email': os.environ['REQUESTER_EMAIL'],
      'publickeyurl': os.environ['REQUESTER_PUBLIC_KEY_URL'], 'taskid':
      os.environ['TASK_ID'], 'taskFolder': os.getcwd()}},
      open(r'z:\loan-request.json', 'wb'))"
    - >-
      c:\mozilla-build\python\python.exe -c "exec(\"import os.path, time\nwhile
      (not os.path.exists(r'z:\loan\credentials.txt.gpg')):\n  time.sleep(1)\"")
  metadata:
    name: Windows Loan Request
    description: Self service windows instance loan request
    owner: grenade@mozilla.com
    source: https://wiki.mozilla.org/index.php?title=ReleaseEngineering/How_To/Self_Provision_a_TaskCluster_Windows_Instance
  tags: {}
  extra: {}


* Change '''workerType''' to one of:
Until [https://bugzil.la/1465374 bug 1465374] is resolved this can be a little tricky.
** '''gecko-1-b-win2012''': Windows Server 2012 r2 - Use for debugging '''Build''' issues
** '''gecko-t-win10-64''': Windows 10 Enterprise, x86_64 - Use for debugging '''Test''' issues
** '''gecko-t-win10-64-gpu''': Windows 10 Enterprise, x86_64 - Use for debugging '''Test''' issues where a GPU is required
** '''gecko-t-win7-32''': Windows 7 Enterprise, x86 - Use for debugging '''Test''' issues
** '''gecko-t-win7-32-gpu''': Windows 7 Enterprise, x86 - Use for debugging '''Test''' issues where a GPU is required
* Change '''REQUESTER_EMAIL''' to the email address associated with your GPG public key
* Change '''REQUESTER_PUBLIC_KEY_URL''' to a URL that will return a plain text (raw) copy of your GPG public key.
* Change '''owner''' to your own email address
* Click the '''Update Timestamps''' button at the bottom
* Click the '''Create Task''' button at the bottom
* When the task completes, navigate to the artifacts tab (select run 0 first) and download the '''public/test_info/credentials.txt.gpg''' artifact to your computer.
* Decrypt it with a command similar to:


  gpg2 --decrypt credentials.txt.gpg
1) Make sure you followed step 5 above!<br />
2) Open a regular command shell (e.g. ''Start Menu'' -> ''Run'' -> <code>cmd.exe</code>)<br />
3) From there check which users are in the Administrators group:


== How do I connect to my borrowed instance? ==
C:\Windows\System32>net localgroup Administrators
Alias name    Administrators
Comment        Administrators have complete and unrestricted access to the computer/domain
Members
-------------------------------------------------------------------------------
Administrator
task_1527672240
The command completed successfully.
4) Check the user you are logged in as is one of the above listed users:
C:\Windows\System32>whoami
i-015fe55bb8553\task_1527672240


Remote Desktop is the only currently supported mechanism. Further, the processes that manage cleanup of abandoned or unused loaner instances check for RDP sessions to determine if the instance is actually in use or if it has been abandoned.
5) Open a new UAC elevated command shell:


=== Connecting from Windows ===
C:\Windows\System32>powershell.exe Start-Process cmd.exe -Verb runAs


* GUI: Start > Remote Desktop Connection > enter the IP Address and credentials from the decrypted credentials.txt file and connect
6) This will require you enter Administrative credentials. You will be presented with a prompt similar to this:
* Command line:


  mstsc /w:1024 /h:768 /v:<ip_address>
[[File:Screen Shot 2018-05-30 at 12.13.51.png|none|Screenshot of dialogue box asking for Administrative credentials]]


=== Connecting from Linux ===
Click on ''More choices'' and select the task user, and copy/paste the task password (<code>Ctrl-V</code>).
==== Fedora ====
  # install xfreerdp
  sudo dnf install -y xfreerdp


  # if you have an American English (en-US) keyboard:
You should now have a command shell running as Administrator!
  xfreerdp /u:$username /p:"$password" /kbd:409 /w:1024 /h:768 +clipboard /v:$ip_address


  # if you have a Queen's English (en-GB) keyboard:
== Rerunning tasks after they have completed ==
  xfreerdp /u:$username /p:"$password" /kbd:809 /w:1024 /h:768 +clipboard /v:$ip_address


For other keyboard layouts, see: https://github.com/FreeRDP/FreeRDP/blob/master/include/freerdp/locale/keyboard.h
If you want to rerun a step after the task has completed, there is a script for each command in the task payload. They are named like this:


=== Connecting from Mac OSX ===
Z:\task_XXXXXXX\command_000000_wrapper.bat
Untested. Please edit this section if you know more.
Z:\task_XXXXXXX\command_000001_wrapper.bat
Z:\task_XXXXXXX\command_000002_wrapper.bat
....


    open rdp://$username:$password@$ip_address:3389?screendepth###24:screenWidth###1024:screenHeight###768
Simply run them by hand to reproduce any of the task steps. If you wish to make changes, feel free to edit them. After the loan expires, the task directory, including all your changes, will be deleted. Therefore make sure to keep track of any changes that need to be put back in the gecko build system.
 
see also: https://apple.stackexchange.com/a/54925/112073
 
Microsoft also has a Remote Desktop app for macOS (available on the app store at https://itunes.apple.com/us/app/microsoft-remote-desktop-10/id1295203466?mt=12) that works pretty well and is easy to use.
 
== How do I return it when I'm done? ==
 
There's no need. Your loaned instance is an Amazon EC2 spot instance whose lifetime is finite. It can even die while it's in the middle of working for you, if the spot price is outbid. Think of it as a disposable minion whose survival depends on being busy and who will expire when exhausted or bored, or will defect to a higher-paying villain at any time. If you don't create an RDP connection to your instance within 30 minutes of requesting it, it will die of boredom. If you disconnect your RDP session for more than 15 minutes, it will die of boredom. If you shut it down, it will die fulfilled. When working on the loaner, it is advisable to save your intermediate results periodically and script your actions, so that if the loaner disappears you can quickly get back to where you were on a new loaner.
 
== How do I get help if it's not working? ==
 
* raise a bug in the [https://bugzilla.mozilla.org/enter_bug.cgi?product=Infrastructure%20%26%20Operations&component=RelOps RelOps component]. cc or assign rthijssen@mozilla.com
* ping :grenade in #taskcluster on IRC (during normal working hours: 9 - 5 GMT+3)

Latest revision as of 17:36, 12 June 2020


For generic-worker 10.5.0 onwards

Integration with taskcluster one-click loaner workflow will be done in bug 1368961.

1) If you are connecting to a hardware worker in our data center (provisionerId: "releng-hardware") such as a MoonShot machine, make sure to announce in #ci and/or create a bug under CiDuty - Bugzilla stating that you will take a loaner.

2) Find the task you want to play with in treeherder, and follow the link to the Taskcluster Task Inspector.
3) Go to Actions -> Edit Task.
4) Add rdpInfo to the payload section:

payload:
  rdpInfo: 'login-identity/<login-identity>/rdpinfo.txt'

For example:

payload:
  rdpInfo: 'login-identity/mozilla-auth0/ad|Mozilla-LDAP|pmoore/rdpinfo.txt'

(Check https://firefox-ci-tc.services.mozilla.com/profile to see what your login identity is; you should have the scope assume:login-identity:<login-identity>.)
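As a minimal sketch, the rdpInfo value can be derived from a login identity as shown below. The helper name rdp_info_path is hypothetical and only illustrates the naming pattern from the example above; it is not part of any Taskcluster library.

```python
def rdp_info_path(login_identity: str) -> str:
    """Return the rdpInfo artifact name for a login identity.

    Hypothetical helper: it only reproduces the pattern
    login-identity/<login-identity>/rdpinfo.txt described above.
    """
    if not login_identity:
        raise ValueError("login identity must not be empty")
    return f"login-identity/{login_identity}/rdpinfo.txt"


# The identity below is the one used in the example above.
print(rdp_info_path("mozilla-auth0/ad|Mozilla-LDAP|pmoore"))
```

Use your own login identity from the profile page in place of the example value.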

4) Check which workerType the task uses from the task definition, and then add the following scope to the list of task scopes:

scopes:
  - 'generic-worker:allow-rdp:gecko-t/<workerType>'

For example:

scopes:
  - 'generic-worker:allow-rdp:gecko-t/t-win7-32-gpu'
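The scope string follows a fixed pattern, sketched below. The function name allow_rdp_scope is hypothetical, and the default provisioner "gecko-t" is taken from the example above; a task on a different provisioner would presumably need that provisionerId instead.

```python
def allow_rdp_scope(worker_type: str, provisioner_id: str = "gecko-t") -> str:
    """Build the allow-rdp scope for a worker type.

    Hypothetical helper; the default provisioner_id is an assumption
    based on the example scope shown above.
    """
    return f"generic-worker:allow-rdp:{provisioner_id}/{worker_type}"


print(allow_rdp_scope("t-win7-32-gpu"))
```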

5) If you will require administrator privileges:

5.1) Ensure you have scope generic-worker:os-group:<provisionerId>/<workerType>/Administrators in https://firefox-ci-tc.services.mozilla.com/profile

If you do not have the scope, request it using a Service Request or by asking in #taskcluster IRC channel.

5.2) Add the following to the task payload:

scopes:
  - generic-worker:os-group:<provisionerId>/<workerType>/Administrators
payload:
  osGroups:
    - Administrators
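If you edit task definitions programmatically, the two additions from step 5.2 could be applied as in the sketch below. It assumes the task definition has been loaded into a plain Python dict; the function name request_admin is hypothetical.

```python
import copy


def request_admin(task: dict, provisioner_id: str, worker_type: str) -> dict:
    """Return a copy of the task with the Administrators scope and OS group added.

    Hypothetical helper operating on a task definition held as a dict.
    """
    updated = copy.deepcopy(task)  # leave the caller's definition untouched
    scope = f"generic-worker:os-group:{provisioner_id}/{worker_type}/Administrators"
    scopes = updated.setdefault("scopes", [])
    if scope not in scopes:
        scopes.append(scope)
    groups = updated.setdefault("payload", {}).setdefault("osGroups", [])
    if "Administrators" not in groups:
        groups.append("Administrators")
    return updated


task = {"scopes": [], "payload": {"maxRunTime": 300}}
print(request_admin(task, "gecko-t", "t-win7-32-gpu"))
```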

6) If your payload mounts any artifacts, make sure the task ID(s) are included as a list in the top-level property dependencies (not under payload).

For example, if the payload contains:

 mounts:
   - directory: .
     content:
       taskId: HUqhbXTMTXSXYEM6K2H3wA

add the following at the top level of the task definition:

dependencies:
  - HUqhbXTMTXSXYEM6K2H3wA
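The check in step 6 can be sketched in code. The example assumes the task definition is available as a Python dict with the mounts under payload, as in the snippet above; the function name add_mount_dependencies is hypothetical.

```python
import copy


def add_mount_dependencies(task: dict) -> dict:
    """Return a copy of the task whose top-level dependencies list
    contains every taskId referenced by payload.mounts.

    Hypothetical helper; mounts without a content.taskId are ignored.
    """
    updated = copy.deepcopy(task)
    dependencies = updated.setdefault("dependencies", [])
    for mount in updated.get("payload", {}).get("mounts", []):
        task_id = mount.get("content", {}).get("taskId")
        if task_id and task_id not in dependencies:
            dependencies.append(task_id)
    return updated


task = {
    "payload": {
        "mounts": [
            {"directory": ".", "content": {"taskId": "HUqhbXTMTXSXYEM6K2H3wA"}},
        ],
    },
}
print(add_mount_dependencies(task)["dependencies"])
```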

7) Submit the task. If the task resolves as exception and doesn't run, check the task log file. It should provide more information about the cause.
8) When the task successfully starts, go to Run Artifacts to see the rdpInfo.txt file appear with rdp connection information.
9) Enter the connection information into your RDP client of choice.
10) Connect with screen resolution 1280x1024. It is important to use this resolution for gecko tests, since this is the screen size used by the tests, and the screen size cannot be changed once you have made a connection.

Performing operations as Administrator

Until bug 1465374 is resolved this can be a little tricky.

1) Make sure you followed step 5 above!
2) Open a regular command shell (e.g. Start Menu -> Run -> cmd.exe)
3) From there check which users are in the Administrators group:

C:\Windows\System32>net localgroup Administrators
Alias name     Administrators
Comment        Administrators have complete and unrestricted access to the computer/domain

Members

-------------------------------------------------------------------------------
Administrator
task_1527672240
The command completed successfully.

4) Check that the user you are logged in as is one of the users listed above:

C:\Windows\System32>whoami
i-015fe55bb8553\task_1527672240
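The comparison in steps 3 and 4 can also be done mechanically. The sketch below parses output shaped like the two samples above; the function names are hypothetical, and the parsing assumes the member names sit between the dashed separator and the "The command completed" line, which may not hold on every Windows locale.

```python
from typing import List


def administrators_members(net_output: str) -> List[str]:
    """Extract member names from `net localgroup Administrators` output."""
    members = []
    in_members = False
    for raw_line in net_output.splitlines():
        line = raw_line.strip()
        if line.startswith("---"):
            in_members = True  # member names follow the dashed separator
            continue
        if not in_members or not line:
            continue
        if line.startswith("The command completed"):
            break
        members.append(line)
    return members


def is_administrator(whoami_output: str, net_output: str) -> bool:
    """Check whether the `whoami` user appears in the Administrators group."""
    # whoami prints <machine>\<user>; keep only the user part.
    user = whoami_output.strip().split("\\")[-1].lower()
    return user in (name.lower() for name in administrators_members(net_output))


NET_OUTPUT = """Alias name     Administrators
Comment        Administrators have complete and unrestricted access to the computer/domain

Members

-------------------------------------------------------------------------------
Administrator
task_1527672240
The command completed successfully.
"""

print(is_administrator("i-015fe55bb8553\\task_1527672240", NET_OUTPUT))
```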

5) Open a new UAC elevated command shell:

C:\Windows\System32>powershell.exe Start-Process cmd.exe -Verb runAs

6) This will require you to enter Administrative credentials. You will be presented with a prompt similar to this:

Screenshot of dialogue box asking for Administrative credentials

Click on More choices, select the task user, and paste the task password (Ctrl-V).

You should now have a command shell running as Administrator!

Rerunning tasks after they have completed

If you want to rerun a step after the task has completed, there is a script for each command in the task payload. They are named like this:

Z:\task_XXXXXXX\command_000000_wrapper.bat
Z:\task_XXXXXXX\command_000001_wrapper.bat 
Z:\task_XXXXXXX\command_000002_wrapper.bat 
....

Simply run them by hand to reproduce any of the task steps. If you wish to make changes, feel free to edit them. After the loan expires, the task directory, including all your changes, will be deleted. Therefore make sure to keep track of any changes that need to be put back in the gecko build system.
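To list the wrapper scripts for a task, the naming pattern above can be generated as in the sketch below. It assumes one wrapper per payload command, numbered from zero with six digits, as the listing suggests; the function name wrapper_scripts is hypothetical, and the task directory placeholder is kept exactly as written above.

```python
from typing import List


def wrapper_scripts(task_dir: str, command_count: int) -> List[str]:
    """Return the expected wrapper script paths for a task directory.

    Hypothetical helper; assumes zero-based, six-digit numbering with
    one script per command in the task payload.
    """
    return [
        f"{task_dir}\\command_{index:06d}_wrapper.bat"
        for index in range(command_count)
    ]


for path in wrapper_scripts(r"Z:\task_XXXXXXX", 3):
    print(path)
```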