Update:Archive/1.0/Developers: Difference between revisions
No edit summary |
|||
Line 7: | Line 7: | ||
For v1.0 we will be performing a security audit with the results of that being the basis for any changes made to the v1.0 code base. We should not be updating functionality or adding new features in the v1.0 tree; only security and performance fixes. | For v1.0 we will be performing a security audit with the results of that being the basis for any changes made to the v1.0 code base. We should not be updating functionality or adding new features in the v1.0 tree; only security and performance fixes. | ||
== | == Overview == | ||
The goal of our proposed security audit is to verify security and improve the stability of the existing UMO application. | The goal of our proposed security audit is to verify security and improve the stability of the existing UMO application. |
Revision as of 18:27, 14 February 2005
Update: Home Page » Development »
Introduction
This will be the plan for getting to a solid v1.0 that is consumable by the public.
For v1.0 we will be performing a security audit with the results of that being the basis for any changes made to the v1.0 code base. We should not be updating functionality or adding new features in the v1.0 tree; only security and performance fixes.
Overview
The goal of our proposed security audit is to verify security and improve the stability of the existing UMO application.
Our hope is to get everyone on the same page by starting much-needed discussion. From discussion we develop goals, which lead to a plan, then action. Our ability to listen to each other and work together will determine the success of UMO, which directly affects the heart of the Mozilla Foundation.
Goals and Priorities
Security
Be responsible for the security of the Mozilla community.
Reliability
Provide reliable tools that scale gracefully.
Reputation
Keep moving the Foundation forward.
Communication
Improve communication between systems and developers to spot potential problems before they happen.
Auditing UMO
Reviewing UMO should be a collaborative effort between system administrators and developers. The following overview focuses on:
- A developer's view on what a good web app is.
- Ways we plan to test UMO:
- Test for remote vulnerabilities using automated test suites.
- During testing slam staging area with a wide array of inputs (PHP, MySQL, special chars).
- Look for ways to reduce access to sensitive information. For example - a publicly posted DB schema
- Ways we could work with sysadmins to improve UMO security.
- The need for better development practices - particularly with benchmarking and staging.
- Reduction of code and addition of proper comments and documentation to improve extensibility.
- Use of standards or style guides to ensure code consistency.
- Structural improvements aimed to improve the overall quality of UMO.
Security
- Programmer inexperience is a primary cause of security holes.
- Use test suites exist to test input validation.
- Isolate and lockdown the production environment.
Consistent with our priorities, security is our primary concern. UMO has the potential to help or harm whomever it touches. The responsibility is on our shoulders to ensure UMO is the good guy and never the bad.
From a developer's standpoint, security in PHP is really a matter of how experienced an author is. An experienced developer understands the difference between local and remote attacks and adequately prepares for both via sound input validation and cooperation with system administrators.
A good security review would involve automating all input methods and testing each input using an exhaustive combination of SQL injections, PHP function calls or variable references, special characters and other unexpected inputs. Some test suites exist for this purpose, such as TestMaker.
With the use of such an application, it would be possible to create macros to mimic user actions over HTTP and efficiently inject a wide array of inputs. This would be a primary focus - most severe exploits are the result of poor input validation and/or testing.
Once we have reached a relatively safe situation with inputs, we must look at dependencies. Outside of our main script or application, what exists on this server that could pose a potential threat? Many people develop simple scripts such as blogs in distributed environments that might host many applications. In such an environment an application is vulnerable to both local and remote attacks by means of the lowest common denominator.
Isolating UMO from the lowest common denominator is of equal importance. This means limiting access, creating a sandbox for UMO to run in, putting up adequate staging areas that would not affect production in the case of a meltdown, and many other things that are beyond a developer's control and should be handled by sysadmins.
Of note - phpMyAdmin exists in the root of UMO. That is a lowest common denominator; unnecessary software that adds additional vulnerabilities to the system. If someone were using that instance of phpMyAdmin and it crashed Apache or MySQL for any number of unforeseen reasons - is that a situation we should prepare for? Having staging affect production?
Additionally, the entire DB structure for UMO is publicly viewable. This makes it very easy for someone who may attain the ability to inject SQL via tainted input to seriously affect the system. For the same reasons MySQL errors should be caught and written to files rather than displayed on screen. Granted, most open source projects publish their database structure, but it should at least be housed somewhere besides the main web server running the production version of the app.
And really, that's just the beginning. Many things are avoidable by using good practices and clear, concise code. Verbose code, poor whitespace, inadequate comments or lack of developer documentation can all lead to obscure sections of code that never pop up until it's too late.
We understand an audit of UMO requires analyzing every nook and cranny of the application. In addition to questioning the general security principles we have discussed, we will analyze the logic behind each line and ensure there are no security loopholes.
Reliability
- Load testing.
- Proper staging.
- Cooperative planning by sysadmins and developers.
- Sound infrastructure and an understanding of how it works.
Reliability of UMO is a vital part of the Foundation. A secure UMO also has to adequately handle its user base as it grows or it runs the risk of turning away users. Graceful scaling is often accomplished through adequate staging, requiring planning and cooperation between sysadmins and developers. There are a variety of tools available such as apache benchmark that can easily handle this job.
Factored into scalability is a disaster plan for extremely high loads during normal operation. This would include disabled non-core services to save cycles. An example would be disabling logging, non-vital dynamic content or eliminating all dynamic content altogether. This again requires testing so server admins and developers can map hardware capabilities based on application configuration.
Lastly, a soundly-written and highly efficient web application remains at the mercy of its supporting hardware infrastructure. That said, any UMO developer should have utmost confidence in and a solid understanding of his/her server infrastructure - and if not, that should be addressed before considering any other reliability issues.
With all of these things in mind, we will assess how well UMO stacks up and work with sysadmins to determine ways to improve overall stability and reliability. Overall, the key here is communication and teamwork between experienced sysadmins and developers.
DB
- Cooperation between developers and sysadmins to determine optimal conditions under relevant load scenarios.
- Proper data typing.
- Proper database normalization.
- Proper (and not excessive) indexing.
- Understanding your RDBMS and its limitations.
- Taking advantage of Apache and persistent connections.
Optimization of database resources is vital to the scalability of a dynamic web application such as UMO. It takes just a little cooperation to tweak things until you get the results you are looking for.
For example, during the scaling of the current UMO service, we found that there were several very expensive MySQL queries that were happening and were causing frequent outages. Upon further examination we found that these SQL statements were being done on columns that had no indexing in place. With the addition of 3 indexes for critical columns we were able to scale the UMO service even further with the existing hardware.
Another example comes from looking at another scaling issue. When a user hits the UMO service they use an SQL statement that is done by date. However, the date column is just a VARCHAR and thus not ideal for doing queries against. In addition, it was not indexed properly thus causing further slowdowns. Using the right data typing in this instance would yield large performance improvements.
Our work on Bouncer has shown how proper indexing and a concrete schema combined with persistent database connections can greatly reduce load on any RDBMS. We will check UMO's database for proper normalization and indexing. We will also ensure there is no database bottleneck for dynamic content.
Extensibility
- Re-use of code and/or utilization of widely accepted libraries.
- The less code you have to write, the easier it is to manage.
- Adequate developer documentation.
- Project standards such as syntax style, tabs vs. spaces, wrapping, etc.
- Use a popular RDBMS and scripting language.
In many cases the code in UMO should be reduced, particularly when calling redundant procedures that deserve to be housed in a set of wrapper functions. The following example is representative of the UMO source:
$sql = "SELECT * FROM table1 INNER JOIN table2 ON table1.id = table2.id"; $sql_result = mysql_query($sql, $connection) or trigger_error("MySQL Error ".mysql_errno().": ".mysql_error()."", E_USER_NOTICE);
As an alternative, it would be fairly easy to write a function to accomplish the exact same thing instead of rewriting this block every time a query is performed:
/** * Do a query. * @param resource $dbh MySQL database handle * @param string $sql MySQL query, unescaped * @return resource $result MySQL result */ function db_query($qry=null,$dbh=null) { static $result = null; if(!is_resource($dbh)) $dbh = db_connect(); if(is_null($qry)) { if(is_resource($result)) return $result; else return false; } else { $result = @mysql_query($qry,$dbh) or db_error(mysql_errno()); return $result; } }
Here, $dbh is a static handle so if it has already been instantiated, it can be reused throughout the entire execution of the script -- likewise with $result. db_connect() is another wrapper that connects to the database and db_error() is a wrapper to handle database errors gracefully (and silently) - we will not go into that right now, assume it works for the example. This ultimately lets us to reduce the code:
$sql_result = db_query("SELECT * FROM table1 INNER JOIN table2 ON table1.id = table2.id");
This is just a small example - don't run it under the microscope. The idea we want to portray is that many times throughout the UMO source the reduction of code could make scripts cleaner, shorter, and more efficient. It also makes the code easier to read and update for future developers, which directly affects extensibility.
For redevelopment, we will likely decide on using a database abstraction layer such as PEAR or adodb. Either one would follow these same guidelines, and would possibly be used in conjunction with similar wrappers.
Notice the comment blocks as well. It is very important to adequately comment all actions. By doing uniform comment blocks before functions and scripts it is possible to use a code documentation utility like phpDoc to generate a complete developer's API on-the-fly as we did with Bouncer. This also greatly improves the extensibility and overall quality of the application.
Attempting to improve the extensibility of UMO would be a potential focus, but not a primary one given the possible redevelopment of UMO. On the other hand, if reduction of code would lead to improved performance, the methodologies discussed above would be applied.
Structure
- Centralized includes, libraries, configuration documents and documentation.
- Files and directories named consistently.
The overall structure of the application can play a role in its performance and security. By centralizing code it would be possible to cache or lockdown certain libraries or configuration files as needed. An added advantage is that it becomes a lot easier to know where certain things are if you have a simpler file structure.
The following is a listing of the current root:
about admin_newaccountmails.php core critical css database developers downformaint.html extensions faq favicon.ico images index.php mrtg phpMyAdmin plugins rss themes update update.rdf
This is a listing of Bouncer's root:
README docs logs perl php admin cfg css img inc index.php lib
It becomes easier to find things by sorting by:
1. Language 2. Roles / access level 3. Action
UMO does not seem to use a consistent naming convention. So if you read the code, it is hard to go through and know exactly where everything is. The lack of documentation also contributes to the confusion. It would be a good idea to reorganize scripts a little better, for example:
admin cfg css dev img lib core pub about extensions plugins themes rdf rss sql
Then, when necessary, create virtual hosts to root in particular directory. For example, http://addons.update.mozilla.org/ could point to the root of /pub. Furthermore, it makes it easier to lockdown remote access to certain directories, like sql, cfg, lib and admin. Other possible benefits are checkouts by directory for a given module. If you wanted to cache all images, you could check out the img directory wherever you wanted, etc.
We intend to look at possible improvements here, but probably would not restructure everything in an audit - it is more a development approach than it is a load or security issue. Structure is important for future discussion regarding development, so we wanted to address the inherent benefits.
Audit Plan
Planning and Preparation
- Eliminate areas that will not ever be implemented (until v2.0)
- Determine which sections of UMO need the most attention
- Create list of items we will review
- Clean code for review (PHP_beautifier)
Code Review
- Document includes, dependencies
- Review application based on priorities, listing needed changes
- Imperative security changes
- Efficiency issues that really need to be changed
- Efficiency or style issues that are cosmetic
- Wiki might be good for cataloguing these
Code Adjustments
- Update code based on quantified 'needed changes'
- If possible, update code for readability and efficiency
Assessment
- Benchmark code to assure scalability
- Check for improvements on performance