No edit summary |
(→Auth: use auth0) |
||
(13 intermediate revisions by 4 users not shown) | |||
Line 6: | Line 6: | ||
It's not ironclad, and is subject to change by consensus. | It's not ironclad, and is subject to change by consensus. | ||
However, if you find yourself making a decision contrary to what's described here, give some extra thought as to why, and how it may cause you or someone else pain down the road. | However, if you find yourself making a decision contrary to what's described here, give some extra thought as to why, and how it may cause you or someone else pain down the road. | ||
= Project Boundaries = | |||
== UNIX Philosophy == | |||
Build many, simple-to-understand tools that communicate using well-defined interfaces and have clear dependencies. | |||
Define the boundaries to these tools clearly: | |||
* dedicated version control repo | |||
* dedicated package for deployment | |||
* dedicated puppetagain class | |||
* dedicated documentation | |||
Avoid the temptation to toss new projects into the build/tools repository, even if they are small. | |||
The [http://12factor.net/ Twelve-Factor App] is a good encapsulation of what appears below. | |||
= Programming Language & Framework = | = Programming Language & Framework = | ||
== Server-side == | == Server-side == | ||
For server-side stuff, use Python | For server-side stuff, use Python 3, with legacy Python support only if needed. | ||
See also our [[ReleaseEngineering/Python Standards|python standards]] document. | |||
Within that: | Within that: | ||
Line 15: | Line 31: | ||
* for DB interfaces: sqlalchemy | * for DB interfaces: sqlalchemy | ||
* for an HTTP client: requests | * for an HTTP client: requests | ||
* for | * for async: asyncio | ||
..and the don'ts: | ..and the don'ts: | ||
* don't manage daemonization yourself; plan to use supervisord in production | * don't manage daemonization yourself; plan to use supervisord in production | ||
* do '''not''' use async libraries like twisted or gevent if you can help it | * do '''not''' use async libraries like twisted or gevent if you can help it | ||
== Browser-side == | == Browser-side == | ||
Use https://neutrino.js.org/ or https://reactjs.org/ | |||
= External Dependencies = | = External Dependencies = | ||
* For DBs: MySQL | * For DBs: MySQL or PostgreSQL | ||
** If you want to use SQLite for development, that's OK, but be aware that you must test thoroughly on MySQL as well - the two are not the same! | ** If you want to use SQLite for development, that's OK, but be aware that you must test thoroughly on MySQL as well - the two are not the same! | ||
* | * Caching: memcached | ||
* Messaging: | * Messaging: amqp | ||
= Resiliency = | = Resiliency = | ||
Think about resiliency from the beginning, as reliable resiliency is generally embedded in the architecture of the tool, making it hard to change later. | Think about resiliency from the beginning, as reliable resiliency is generally embedded in the architecture of the tool, making it hard to change later. | ||
* If you're building a daemon, build it so that it can be restarted at will without any serious ill effects. | * If you're building a daemon, build it so that it can be restarted at will without any serious ill effects. | ||
** Add support for gracefully stopping your service so that new versions or configurations can be deployed without downtime | |||
** To this end, do not store state in memory, including in the call stack. If you have a long-running process that you don't want to restart during a transient failure, then break that task up, persist intermediate state somewhere, and be able to pick up where you left off. This is often most easily accomplished with an explicit state machine. It cannot be achieved when the state lives only in the call stack of a single long-running function. | ** To this end, do not store state in memory, including in the call stack. If you have a long-running process that you don't want to restart during a transient failure, then break that task up, persist intermediate state somewhere, and be able to pick up where you left off. This is often most easily accomplished with an explicit state machine. It cannot be achieved when the state lives only in the call stack of a single long-running function. | ||
* Retry everything: HTTP requests, DB queries, DNS lookups, etc. | * Retry everything: HTTP requests, DB queries, DNS lookups, etc. | ||
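
The advice above about persisting intermediate state can be illustrated with a small explicit state machine. This is a minimal sketch, not a prescribed implementation: the step names, the JSON state file, and the helper functions are all hypothetical, and a real tool might well keep its state in a database instead.

```python
import json
import os
import tempfile

# Hypothetical step names for illustration; "done" is the terminal state.
STEPS = ["download", "verify", "publish", "done"]


def load_state(path):
    """Return the persisted state, or the initial state if none exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"step": STEPS[0], "log": []}


def save_state(path, state):
    """Persist state atomically: write a temp file, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(path, max_steps=None):
    """Advance the task one step at a time, saving after every transition."""
    state = load_state(path)
    taken = 0
    while state["step"] != "done":
        if max_steps is not None and taken >= max_steps:
            break  # simulates the daemon being stopped mid-task
        current = state["step"]
        state["log"].append(current)  # stand-in for the real work of this step
        state["step"] = STEPS[STEPS.index(current) + 1]
        save_state(path, state)
        taken += 1
    return state
```

Because the only durable state is what `save_state` wrote, a process that is killed between steps resumes from the last completed step when `run` is called again, rather than starting over.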
Line 52: | Line 64: | ||
= Deployment = | = Deployment = | ||
== Server-side == | == Server-side == | ||
Server-side deployments will either be to dedicated boxes managed by [[ | Server-side deployments will either be to dedicated boxes managed by [[ReleaseEngineering/PuppetAgain|PuppetAgain]], or as webapps in the releng cluster. | ||
* Install Python apps as pip-installable tarballs, with [http://semver.org semantic versions], rather than from hg repositories | * Install Python apps as pip-installable tarballs, with [http://semver.org semantic versions], rather than from hg repositories | ||
Line 64: | Line 76: | ||
* Plan to run under mod_wsgi in Apache | * Plan to run under mod_wsgi in Apache | ||
* Plan to run across multiple webheads with dedicated disk, if possible; shared netapp storage is available too if necessary | * Plan to run across multiple webheads with dedicated disk, if possible; shared netapp storage is available too if necessary | ||
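
As a rough illustration of what running under mod_wsgi involves, the sketch below shows a bare WSGI callable. mod_wsgi looks for a callable named `application` by default; the response body here is purely illustrative, and a real app would normally be built with a web framework rather than written by hand.

```python
def application(environ, start_response):
    """Minimal WSGI callable that keeps no state between requests."""
    body = b"ok\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Since the callable holds nothing in memory between requests, any webhead can serve any request, which is what makes running across multiple webheads straightforward.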
= Auth = | = Auth = | ||
Authentication should be handled by auth0. | |||
= Documentation = | = Documentation = | ||
Line 85: | Line 94: | ||
** Passwords available to your application are assumed to be compromised when the application is. If this would represent a privilege escalation, then your entire application becomes security-sensitive. Also, as employees come and go, the passwords must be rotated, so plan ahead for easy password changes. | ** Passwords available to your application are assumed to be compromised when the application is. If this would represent a privilege escalation, then your entire application becomes security-sensitive. Also, as employees come and go, the passwords must be rotated, so plan ahead for easy password changes. | ||
** Don't re-use generic SSH keys, e.g., 'cltbld' or 'id_dsa'. Make a purpose-specific key, document it, and if possible limit its capabilities using authorized_keys on the destination host. | ** Don't re-use generic SSH keys, e.g., 'cltbld' or 'id_dsa'. Make a purpose-specific key, document it, and if possible limit its capabilities using authorized_keys on the destination host. | ||
** Handle secrets carefully, so that they don't end up checked into repositories, pasted into pastebins or etherpads, or sitting in world-readable logfiles | |||
* network | * network | ||
** The Releng Network is isolated from the Internet and the rest of the company, and parts of it deny all but requested flows. Still, this is only one layer. Consider, too, that we often allow less-trusted individuals onto the network for debugging purposes. You should consider the Releng Network a hostile environment: encrypt, authenticate, resist spoofing, and so on. | ** The Releng Network is isolated from the Internet and the rest of the company, and parts of it deny all but requested flows. Still, this is only one layer. Consider, too, that we often allow less-trusted individuals onto the network for debugging purposes. You should consider the Releng Network a hostile environment: encrypt, authenticate, resist spoofing, and so on. | ||
* obscurity | |||
** This generally goes without saying at Mozilla, but don't rely on an attacker not knowing something that isn't explicitly handled as a secret. | |||
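
One way to reduce the chance of secrets landing in logfiles is a logging filter that redacts known secret values. This is a minimal sketch under the assumption that the application already knows the secret strings it holds; the class name `RedactSecrets` is hypothetical. It is a last line of defense, not a substitute for keeping secrets out of log calls in the first place.

```python
import logging


class RedactSecrets(logging.Filter):
    """Replace known secret values in log messages with a placeholder."""

    def __init__(self, secrets):
        super().__init__()
        # Ignore empty strings, which would otherwise match everywhere.
        self._secrets = [s for s in secrets if s]

    def filter(self, record):
        # Render the message first so secrets passed as arguments are caught too.
        message = record.getMessage()
        for secret in self._secrets:
            message = message.replace(secret, "[REDACTED]")
        record.msg = message
        record.args = ()
        return True  # never drop the record, only rewrite it
```

Attaching the filter to a handler (with `handler.addFilter(...)`) applies it to every record that handler emits. Note that this sketch does not touch exception tracebacks, which are formatted separately.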
= Performance = | = Performance = | ||
Systems don't really scale up very well anymore: you can't get substantially faster processors, faster disk, or faster network. So build your application to scale horizontally -- by adding more instances. | |||
This is easy for a webapp, as webapps must be designed from the start to run across multiple webheads. But it's important for a service, too. Multiple hosts, particularly if they are not in the same location, can achieve both scalability and availability. | |||
Multi-threading can help with performance on a single host, but is really only useful in IO-bound situations, where the threading model makes it easy to deal with blocking operations. | |||
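
To illustrate the IO-bound case, here is a minimal sketch using a thread pool from the standard library. The `fetch` function is a hypothetical stand-in for a blocking call such as an HTTP request or DB query, simulated here with `time.sleep`.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(item):
    """Hypothetical blocking operation; the sleep stands in for network IO."""
    time.sleep(0.1)
    return item * 2


def fetch_all(items, workers=8):
    """Run the blocking calls concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, items))
```

The threads spend most of their time waiting, so the waits overlap. For CPU-bound work this approach would not be expected to help much in CPython, which is consistent with the point above.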
= Configuration = | |||
Applications should read from a single config file, generally in a ConfigParser-compatible format. It's also OK to use YAML. JSON is frowned upon because some parsers are too picky (e.g., about trailing commas). |
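
A minimal sketch of reading a single ConfigParser-format file follows. The section and option names in the usage example are hypothetical, chosen only for illustration.

```python
import configparser


def load_config(path):
    """Parse one config file, raising if the file is missing or unreadable."""
    parser = configparser.ConfigParser()
    # read_file() on an open file fails loudly on a missing path, whereas
    # parser.read(path) would silently skip it.
    with open(path) as f:
        parser.read_file(f)
    return parser


# Example usage, assuming a file containing:
#
#   [database]
#   url = mysql://localhost/example
#   pool_size = 5
#
#   [service]
#   debug = no
#
# config = load_config("/etc/myapp/myapp.ini")
# config.getint("database", "pool_size")
# config.getboolean("service", "debug")
```

Opening the file explicitly is a deliberate choice here: a misdeployed or missing config file surfaces as an error at startup instead of the application quietly running on defaults.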