Devops/monitoring-alerting
Mozilla Foundation Monitoring & Alerting
TLDR:
- Performance monitoring and most infrastructure monitoring are in New Relic
- New Relic Dashboards are a good way to get info fast (login required)
- Load balancer health, database, and overall application healthchecks in Opsview (login required) (Public Viewport)
- Many dashboards for traffic, metrics, performance, and monitoring on this Dashboard of Dashboards
- For accounts, questions, or suggestions, email jp at mozillafoundation.org
MONITORING TOOLS, SYSTEMS, AND LINKS
Mozilla Foundation applications are monitored and measured in a number of systems:
- Opsview, a Nagios clone with a much friendlier interface.
- * Monitors and alerts when servers in load balancers are unhealthy
- * Monitors and alerts on uptime/downtime of overall endpoints, such as https://webmaker.org
- * Monitors and alerts on database utilization and downtime.
- "Important Opsview Links"
- Public Status Page
- Current Unhandled Alerts (Login required)
- Recent Alerts in Opsview
- !!!TODO : Add the guide for notifications & contact settings
- New Relic monitoring (Login Required)
- * Watching application response time in the browser and on the server side
- * Watching database and web server utilization, transactions, timings, and throughput
- * Watching load balancer (ELB) metrics
- * Performing server-side and client-side tracing of long-running transactions
- * Overall endpoint monitoring, such as https://webmaker.org
- * Watching cache server utilization and metrics
- * Watching Elasticsearch server utilization and metrics
- * Watching Mongo server utilization and metrics
- * Marks and compares new/old deployed versions of software
- !!!TODO : Add the guide for notifications & contact settings
- Log monitoring with Loggins (Kibana) (Login Required)
- "AWS Infrastructure and Autoscaling Monitoring/Alerting"
- * An email group exists to be notified of any autoscaling activities (up or down). Contact jp at mozillafoundation.org to be added to this list.
- * CloudWatch in the AWS console can monitor many utilization metrics, including CPU or network usage for an autoscaling group, database, server, or ELB. Few alarms are configured there apart from the ones that trigger scaling.
- Most AWS infrastructure is monitored via New Relic. See the side menu options in New Relic for RDS, ELB, EC2, ElastiCache, etc.
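To illustrate how a metric alarm can drive autoscaling, here is a minimal sketch in plain Python. It is not the Foundation's actual configuration: the function names, the 70%/20% CPU thresholds, and the three-period window are all assumptions chosen for the example. It only mirrors the general idea that an alarm fires when a metric stays past a threshold for several consecutive evaluation periods.

```python
from typing import Sequence


def breaches_threshold(datapoints: Sequence[float], threshold: float, periods: int) -> bool:
    """Return True if the last `periods` datapoints all exceed `threshold`."""
    if periods <= 0:
        raise ValueError("periods must be positive")
    if len(datapoints) < periods:
        # Not enough data yet to evaluate the alarm.
        return False
    return all(value > threshold for value in datapoints[-periods:])


def scaling_decision(
    cpu_percent: Sequence[float],
    scale_up_at: float = 70.0,    # hypothetical threshold
    scale_down_at: float = 20.0,  # hypothetical threshold
    periods: int = 3,             # hypothetical evaluation window
) -> str:
    """Classify recent CPU datapoints as 'scale_up', 'scale_down', or 'no_change'."""
    if breaches_threshold(cpu_percent, scale_up_at, periods):
        return "scale_up"
    recent = cpu_percent[-periods:]
    if len(recent) == periods and all(value < scale_down_at for value in recent):
        return "scale_down"
    return "no_change"
```

For example, `scaling_decision([50, 75, 80, 90])` reports a scale-up because the last three datapoints all exceed the assumed 70% threshold, while a single spike between quiet periods leaves the group unchanged.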