AUS:v3: Difference between revisions
Line 8: | Line 8: | ||
== Lookup System == | == Lookup System == | ||
=== | === Fully Normalized Hierarchy === | ||
[[Image:AUSSchema.complete.hierarchy.png]] | [[Image:AUSSchema.complete.hierarchy.png]] |
Revision as of 19:06, 6 November 2007
Database Schema Goals
- Referential integrity in data source
- Add an RDBMS - MySQL likely - that uses a schema designed to store the data required for business rules and update metadata -- including data points we may not even track currently (these need to be defined).
- No duplication of data
- Basic normalization (without OVER normalizing) should quite easily take care of this.
Lookup System
Fully Normalized Hierarchy
Stats:
- number of tables: 17
- number of joins to get data: 16
Sample SQL:
select payload.* from product join pv on (product.name = %PRODUCT% and product.id = pv.productId) join version on (version.name = %VERSION% and version.id = pv.versionId) join pvb on (pv.id = pvb.pvId) join build_id on (build_id.name = %BUILD_ID% and build_id.id = pvb.build_idId) join pvbbt on (pvb.id = pvbbt.pvbId) join build_target on (build_target.name = %BUILD_TARGET% and build_target.id = pvbbt.build_targetId) join pvbbtl on (pvbbt.id = pcbbtl.pvbbtId) join locale on (locale.name = %LOCALE% and locale.id = pvbbtl.localId) join pvbbtlc on (pvbbtl.id = pvbbtlc.pvbbtlId) join channel on (channel.name = %CHANNEL% and channel.id = pvbbtlc.channelId) join pvbbtlcos on (pvbbtlc.id = pcvbbtlcos.pvbbtlcId) join os_version on (os_version.name = %OS_VERSION% and os_version.id = pvbbtlcos.os_versionId) join pvbbtlcosd on (pvbbtlcos.id = pvbbtlcosd.pcbbtlcosId) join distribution in (distribution.name = %DISTRIBUTION% and distribution.id = pvbbtlcosd.distributionId) join payload on (pvbbtlcosd.id = payload.pvbbtlcosdId) join distibution_version on (distibution_version.name = %DISTRIBUTION_VERSION% and distibution_version.id = payload.distibution_versionId)
Advantages:
- simulates the original hierarchical structure
- highly normalized
Disadvantages:
- very verbose SQL
- structure, while faithful to the previous system, doesn't necessarily accurately model the data relationships
Rearranged Hierarchy
The following diagram has an error: the labels channel and os_version have been reversed...
Rearranged Collapsed Hierarchy
Collapsed
Metadata First Cut
Converting an XML Schema into a relational database schema can be pretty straight forward. An element generally corresponds with a row in a table. The attributes inside the start tags correspond to the data in columns of the table. Nested elements are rows in another table that has a foreign key relationship with the table to which the parent element is associated. The tree structure of XML lends itself to orderly relational database schema.
While more complicated relationships can exist between the elements, the schema of this project seems very simple.
<?xml version="1.0"?> <updates> <update type="minor" version="1.0.4" extensionVersion="1.0" detailsURL="http://www.foo.com/1.0.4/whatsnew.html"> <patch type="partial" URL="http://www.foo.com/1.0.4-partial.mar" hashFunction="" hashValue="" size=""/> <patch type="complete" URL="http://www.foo.com/1.0.4-complete.mar" hashFunction="" hashValue="" size=""/> </update> .. <update type="major" version="1.1.2" extensionVersion="1.1" detailsURL="http://www.foo.com/1.1.2/whatsnew.html"> <patch type="complete" URL="http://www.foo.com/1.1.2-complete.mar" hashFunction="" hashValue="" size=""/> </update> </updates>
AUS Metadata Database Schema Naive Version 1
This proposal introduces just two tables: updates and patches. They correspond to the major elements of the aforementioned XML file.
table 1: updates
column name | type | attribute | comment |
id | sequence | primary | |
type | varchar | enum | integer | several options for implementation | |
version | varchar | likely a foreign key? | |
extensionVersion | varchar | likely a foreign key? | |
detailsURL | varchar | is this url completely unique? |
id | type | version | extensionVersion | detailsURL |
0 | minor | 1.0.4 | 1.0 | http://www.foo.com/1.0.4/whatsnew.html |
1 | major | 1.1.2 | 1.1 | http://www.foo.com/1.1.2/whatsnew.html |
table 2: patches
column name | type | attribute | comment |
id | sequence | primary | |
type | varchar | enum | integer | several options for implementation | |
URL | varchar | is this url completely unique? | |
hashFunction | varchar | ||
hashValue | varchar | ||
size | varchar | if it's never actually used as an integer |
|
updateId | integer | FK(updates) |
id | type | URL | hashFunction | hashValue | size | updateId |
0 | partial | http://www.foo.com/1.0.4-partial.mar | 0 | |||
1 | complete | http://www.foo.com/1.0.4-complete.mar | 0 | |||
2 | complete | http://www.foo.com/1.1.2-complete.mar | 1 |
Discussion
This schema is missing any indication of the product. Does this system need to take into account more than just one product? If is does, imagine another enclosing
How does this schema meet the goals at the top of the page?
- ...data required for business rules and update metadata...: What are the business rules? Can't test to see if this is sufficient for the business rules until I have business rules. If the "update metadata" is captured in its entirety by the XML example, the it is safe to judge that this database schema is equivalent.
- Basic normalization (without OVER normalizing)...: This simple schema is normalized if the values in the columns are unique. Each table has a type attribute that could be foreign keys to tables listing the allowed types. Since there are so few options, splitting them off into their own tables would be over normalization. The URL columns, if not unique within their tables, could be consolidated into their own tables leaving behind a foreign key.