Roadmap

Q1 2013 roadmap

In Q1 2013 the Identity DevOps team will be moving services out of the physical datacenter SCL2 and into Amazon Web Services.

2/4

1/28 - 2/4 : roadmap defined and signed off
1/28 - 2/4 : technology stack justification written and shared

2/11

2/4 - 2/11 : chef server built and working
2/4 - 2/11 : established 1 region VPC

2/18

2/4 - 2/18 : completed a mini provisioning test and plan
2/11 - 2/18 : written the webhead chef provisioning logic
2/11 - 2/18 : written the nginx chef provisioning logic and carried over existing nginx routing logic
milestone : chef can fully provision webheads

2/25

2/18 - 2/25 : zeus routing logic is converted into nginx logic
2/18 - 2/25 : written the nagios chef provisioning logic
2/18 - 2/25 : basic webhead nagios checks created
milestone : admin can see the monitored availability and performance of the webhead
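As a rough illustration of what a "basic webhead nagios check" could look like, here is a minimal sketch of a plugin that fetches one URL and maps the HTTP status and response time to the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The script name, the function names, and the 1 second / 3 second thresholds are assumptions made for this example, not values taken from this roadmap.

```python
import sys
import time
import urllib.error
import urllib.request

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(status, elapsed, warn=1.0, crit=3.0):
    """Map an HTTP status and response time (seconds) to (exit code, message).

    The warn/crit thresholds are illustrative defaults, not agreed values.
    """
    if status is None:
        return CRITICAL, "CRITICAL - no response"
    if status >= 500:
        return CRITICAL, "CRITICAL - HTTP %d" % status
    if elapsed >= crit:
        return CRITICAL, "CRITICAL - %.2fs response" % elapsed
    if status >= 400 or elapsed >= warn:
        return WARNING, "WARNING - HTTP %d in %.2fs" % (status, elapsed)
    return OK, "OK - HTTP %d in %.2fs" % (status, elapsed)


def check(url, timeout=10.0):
    """Fetch url once and classify the result."""
    start = time.time()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with a 4xx/5xx status.
        status = err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and so on.
        status = None
    return classify(status, time.time() - start)


def main(argv):
    """Print one status line and return the Nagios exit code."""
    if len(argv) != 2:
        print("UNKNOWN - usage: check_webhead.py URL")
        return UNKNOWN
    code, message = check(argv[1])
    print(message)
    return code
```

To run this as an actual plugin, the file would end with `sys.exit(main(sys.argv))`, so that nagios sees the exit code. The same check could later be pointed at the ELB hostname instead of a single webhead.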

3/4

2/25 - 3/4 : ELB is set up and sending traffic to the webhead
2/25 - 3/4 : basic webhead nagios checks against the ELB created
milestone : internet client can fetch persona main page from AWS traversing ELB

3/11

3/4 - 3/11 : written swebhead chef provisioning logic
3/4 - 3/11 : ELB configured for swebhead cluster
3/4 - 3/11 : written db chef provisioning logic
3/4 - 3/11 : ELB configured for db cluster
3/4 - 3/11 : written keysign chef provisioning logic
3/4 - 3/11 : ELB configured for keysign cluster
3/4 - 3/11 : established VPN to PHX1
3/4 - 3/11 : basic swebhead, db, and keysign nagios checks created
milestone : internet client can login using persona in AWS

3/18

3/11 - 3/18 : written bigtent and squid proxy chef provisioning logic
3/11 - 3/18 : ELB configured for bigtent and squid clusters
3/11 - 3/18 : basic bigtent and squid proxy nagios checks created
milestone : internet client can log in with a Yahoo address using Yahoo bigtent

3/25

3/18 - 3/25 : load tested/validated that region 1 is ready for prod traffic
3/18 - 3/25 : full security group logic is in place replicating existing physical network
milestone : network security is in place and all tiers use proxies for communication
milestone : load testing complete for region 1

4/1

3/25 - 4/1 : moved db write master from PHX1 to AWS region 1
dynect is changed to balance between AWS region 1 and PHX1. SCL2 sits running as a backup
milestone : all db writes are now going to AWS
milestone : AWS region 1 is live in production, SCL2 no longer receives traffic

State at end of Q1 2013

SCL2 is dark
Production is running off of 1 AWS region and 1 physical datacenter (PHX1)
Runbooks for AWS deployments and core troubleshooting have been developed
The staging environment has been moved to AWS
- key differences between production and staging AWS areas: server localization & access
Monitoring: existing monitoring, minus some cepmon rate-of-change monitors, has been moved into a new nagios deployment in AWS
Alerting: existing alerting, minus cepmon-triggered alerts, has been migrated

Q2 2013 roadmap

In Q2 2013 the Identity DevOps team will bring up the second AWS region and execute the remaining tasks needed to reach a truly highly available architecture, ready to graduate from beta.

4/8

3/25 - 4/8 : spun up AWS region 2
3/25 - 4/8 : determined how to do log processing (logstash?) and pump data into zenoss

4/15

4/8 - 4/15 : load tested/validated that region 2 is ready for prod traffic
4/1 - 4/15 : written auto provisioning logic to call AWS and spin up instances, assign them roles, and pass them to chef for provisioning
dynect is changed to balance between AWS region 1 and AWS region 2
milestone : persona is fully hosted in AWS multi-region
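The auto provisioning item above (call AWS, spin up instances, assign them roles, pass them to chef) could be structured roughly as follows. This is a minimal sketch only: `launch` and `bootstrap` are hypothetical callables standing in for the real AWS API call and the chef hand-off, and the role counts in the usage note are made up for illustration.

```python
def provision(desired, running, launch, bootstrap):
    """Bring each role up to its desired instance count.

    desired   -- dict of role name -> number of instances wanted
    running   -- dict of role name -> number of instances already up
    launch    -- hypothetical callable(role) -> instance id (wraps the AWS call)
    bootstrap -- hypothetical callable(instance_id, role) (hands off to chef)

    Returns the list of (role, instance_id) pairs that were created.
    """
    created = []
    for role in sorted(desired):
        missing = desired[role] - running.get(role, 0)
        # Never scale down here; only fill in what is missing.
        for _ in range(max(missing, 0)):
            instance_id = launch(role)
            bootstrap(instance_id, role)
            created.append((role, instance_id))
    return created
```

For example, with `desired = {"webhead": 3, "keysign": 2}` and `running = {"webhead": 1}`, the function would launch and bootstrap two keysign instances and two webheads. Keeping the AWS and chef calls behind plain callables makes the role-assignment logic easy to exercise without touching a real account.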

4/30

Final day to turn down servers at SCL2

5/13

Modify DB architecture to remove single point of failure (single write master)
- This is not re-evaluating our choice of persistence. It's just making our existing architecture truly fault-tolerant and highly available.
Add more performance monitoring to enable later platform improvements
- There are many ways we could further scale. To make intelligent choices, we need to gather information about the performance and behavior of our servers.

Beyond

Additional Operational Improvements
