MDN/Archives/Development/CompatibilityTables/Importer
Note: If you came to this page from a link on https://browsercompat.herokuapp.com/importer, additional documentation may not have been written yet for that importer issue. Feel free to add a Level 3 heading for your issue.
General Information
It is not enough to create a data store for compatibility data. It also needs to be populated with structured data. We've decided to start with MDN data, rather than start from scratch or an existing data source. An MDN data importer is now part of the web-platform-compat project, and is live at https://browsercompat.herokuapp.com/importer/.
Expected MDN content
The importer works with the raw versions of pages, which contain HTML with KumaScript tags. For example, the MDN page about the HTML <p> element is:
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/p
and the raw version of the page is:
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/p?raw
You can also see the raw version by editing a page and selecting the "Source" button in the upper left corner.
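As a minimal sketch of the URL convention described above, the raw address can be derived from the normal page address by adding the raw query parameter. The function name raw_url is hypothetical and is not part of the importer.

```python
from urllib.parse import urlsplit, urlunsplit


def raw_url(page_url):
    """Return the address of the raw version of an MDN page (adds ?raw)."""
    parts = urlsplit(page_url)
    query = parts.query
    # Only add the parameter if it is not already present.
    if "raw" not in query.split("&"):
        query = f"{query}&raw" if query else "raw"
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, query, parts.fragment)
    )


print(raw_url("https://developer.mozilla.org/en-US/docs/Web/HTML/Element/p"))
```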
The importer is expecting a page that matches this pattern (most pages are more complex):
<h2 id="Summary">Summary</h2>
<!-- ... Other content .... -->
<h2 id="Specifications" name="Specifications">Specifications</h2>
<table class="standard-table">
  <thead>
    <tr>
      <th scope="col">Specification</th>
      <th scope="col">Status</th>
      <th scope="col">Comment</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>{{SpecName('HTML WHATWG', 'grouping-content.html#the-p-element', '<p>')}}</td>
      <td>{{Spec2('HTML WHATWG')}}</td>
      <td> </td>
    </tr>
  </tbody>
</table>
<h2 id="Browser_compatibility" name="Browser_compatibility">Browser compatibility</h2>
<div> {{CompatibilityTable}}</div>
<div id="compat-desktop">
  <table class="compat-table">
    <tbody>
      <tr>
        <th>Feature</th>
        <th>Chrome</th>
        <th>Firefox (Gecko)</th>
      </tr>
      <tr>
        <td>Basic support</td>
        <td>1.0</td>
        <td>{{CompatGeckoDesktop("1.0")}} [1]</td>
      </tr>
    </tbody>
  </table>
</div>
<div id="compat-mobile">
  <table class="compat-table">
    <tbody>
      <tr>
        <th>Feature</th>
        <th>Android</th>
        <th>Firefox Mobile (Gecko)</th>
      </tr>
      <tr>
        <td>Basic support</td>
        <td>{{CompatVersionUnknown}}</td>
        <td>{{CompatGeckoMobile("1.0")}}</td>
      </tr>
    </tbody>
  </table>
</div>
<p>[1] This is a footnote</p>
<h2 id="See_also">See also</h2>
<!-- ... Rest of content ... -->
The importer is flexible about whitespace and some common MDN alternate patterns, but this flexibility has to be built in. If the page uses valid but unexpected HTML, the importer will fail, usually with a "section_skipped" critical error.
The Parser
The importer uses a Parsing Expression Grammar (PEG) library to parse the raw MDN page. This can extract the useful data, as well as report the position and range of some unexpected content. The page grammar is in the source code, and, while precise, can be difficult to understand.
For example, to parse a row in the specifications table:
<tr>
  <td>{{SpecName('HTML WHATWG', 'grouping-content.html#the-p-element', '<p>')}}</td>
  <td>{{Spec2('HTML WHATWG')}}</td>
  <td> </td>
</tr>
the grammar specifies:
spec_row = tr_open _ specname_td _ spec2_td _ specdesc_td _ "</tr>" _
specname_td = td_open _ kumascript "</td>"
spec2_td = td_open _ kumascript "</td>"
specdesc_td = td_open _ inner_td _ "</td>"
inner_td = ~r"(?P<content>.*?(?=</td>))"s
td_open = "<td" _ opt_attrs ">"
... other rules ...
_ = ~r"[ \t\r\n]*"s
The first line (known as a "rule") says "Expect a tr_open, followed by optional whitespace, followed by a specname_td, ...", where the rules tr_open and specname_td are defined further down in the grammar. The PEG engine tries to match the MDN page against the grammar. If successful, the content matched by the rules can be extracted for further processing. If the grammar doesn't match, the rule where matching stopped is reported so that a person can investigate.
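The matching behaviour can be illustrated with a small sketch. It does not use the PEG library or the importer's real grammar. Instead it approximates each rule with a regular expression from Python's standard re module, tries the rules in order, and reports the name of the rule where matching stopped. The rule list SPEC_ROW_RULES, the function parse_spec_row, the rule name tr_close, and the group names are all hypothetical simplifications.

```python
import re

WS = r"[ \t\r\n]*"       # same idea as the "_" rule: optional whitespace
KUMA = r"\{\{.*?\}\}"    # simplified stand-in for the "kumascript" rule

# (rule name, regex) pairs, tried in order. Simplified approximations only.
SPEC_ROW_RULES = [
    ("tr_open", r"<tr[^>]*>"),
    ("specname_td", r"<td[^>]*>" + WS + r"(?P<specname>" + KUMA + r")</td>"),
    ("spec2_td", r"<td[^>]*>" + WS + r"(?P<spec2>" + KUMA + r")</td>"),
    ("specdesc_td", r"<td[^>]*>" + WS + r"(?P<desc>.*?)" + WS + r"</td>"),
    ("tr_close", r"</tr>"),
]


def parse_spec_row(text):
    """Return (data, failed_rule, position) for one specification row."""
    pos = 0
    data = {}
    for name, pattern in SPEC_ROW_RULES:
        match = re.compile(WS + pattern, re.DOTALL).match(text, pos)
        if match is None:
            # Report the rule where matching stopped, and where.
            return None, name, pos
        data.update(match.groupdict())
        pos = match.end()
    return data, None, pos


good = ("<tr> <td>{{SpecName('HTML WHATWG', "
        "'grouping-content.html#the-p-element', '<p>')}}</td> "
        "<td>{{Spec2('HTML WHATWG')}}</td> <td> </td> </tr>")
bad = "<tr> <td>{{SpecName('HTML WHATWG')}}</td> <td>Living Standard</td> <td> </td> </tr>"

print(parse_spec_row(good)[0])
print(parse_spec_row(bad)[1])
```

In the second row the status cell contains plain text instead of a KumaScript macro, so matching stops at the spec2_td rule, which is the kind of information the importer reports alongside a section_skipped issue.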
When fixing an error
If you want to help fix these errors, the best place to find an error to fix is the importer issues page. Click on the links to go to the Parse results page for each page that raised an issue.
On these pages you will see a variety of data about the import status of each MDN page's compat data. The most useful items are "MDN URL", "Issues", and the "Actions" > "Reset" button.
You need to read the "Issues" information to find out what the problem is. Look up the issue code in "The Issues" section of this page, below, to find out how to tackle fixing the problem.
Click on the link in the "MDN URL" section to go directly to the MDN page that has a problem, and edit the page to try to fix the problem.
Next, click the "Reset" button to get the system to re-download and re-parse the MDN page. If you successfully fixed the problem, the "Issues" section should list a message of "None detected." If this isn't the case, repeat the process and try again.
If you can't fix an error, contact the MDN writers about it using MDN's IRC channels or mailing lists.
The Issues
The importer identifies classes of issues with a slug, a short span of text. The sections below use those slugs as the title, so that we can link directly from the importer to advice for handling that class of issue.
bad_json
Error template:
Response from {url} is not JSON
Actual content:
{content}
No one has contributed hints for handling this issue yet.
compatgeckodesktop_unknown
Error template:
Unknown Gecko version "{version}"
The importer does not recognize this version for CompatGeckoDesktop. Change the MDN page or update the importer.
Possible solutions:
- You need to make sure you use the correct macro call: CompatGeckoDesktop(" ... ")
- The "..." above needs to be replaced with the version of Gecko you want listed.
- The version of Gecko needs to exist! To check that it exists if you are not sure, check for it on the Firefox Developer Release Notes page.
- If you want to state Firefox 3.5, the version number you need to enter is actually 1.9.1. See Element.querySelector.
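The relationship between Gecko versions and Firefox versions can be sketched as a lookup table. This is an illustration only: the table below is a partial list based on Mozilla's release history, the "1.0" entry is an assumption made because the example page above uses CompatGeckoDesktop("1.0"), and the importer's own table may differ. The names GECKO_TO_FIREFOX and firefox_version are hypothetical.

```python
# Partial, illustrative mapping from Gecko version to Firefox version.
GECKO_TO_FIREFOX = {
    "1.0": "1.0",    # assumed alias, see the note above
    "1.7": "1.0",
    "1.8": "1.5",
    "1.8.1": "2.0",
    "1.9": "3.0",
    "1.9.1": "3.5",
    "1.9.2": "3.6",
    "2.0": "4.0",
}


def firefox_version(gecko_version):
    """Translate a Gecko version string into a Firefox version string."""
    if gecko_version in GECKO_TO_FIREFOX:
        return GECKO_TO_FIREFOX[gecko_version]
    major = gecko_version.split(".")[0]
    # From version 5 onward, Gecko and Firefox share version numbers.
    if major.isdigit() and int(major) >= 5:
        return gecko_version
    raise ValueError(f'Unknown Gecko version "{gecko_version}"')


print(firefox_version("1.9.1"))
```

With this table, passing "3.5" raises an error because 3.5 is a Firefox version, not a Gecko version, which is the mistake the last hint above describes.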
compatgeckofxos_override
Error template:
Override "{override}" is invalid for Gecko version "{version}".
The importer does not recognize this override for CompatGeckoFxOS. Change the MDN page or update the importer.
No one has contributed hints for handling this issue yet.
compatgeckofxos_unknown
Error template:
Unknown Gecko version "{version}"
The importer does not recognize this version for CompatGeckoFxOS. Change the MDN page or update the importer.
No one has contributed hints for handling this issue yet.
exception
Error template:
Unhandled exception
{traceback}
No one has contributed hints for handling this issue yet.
extra_cell
Error template:
Extra cell in compatibility table row.
A row in the compatibility table has more cells than the header row. It may be the cell identified in the context, a different cell in the row, or a missing header cell.
No one has contributed hints for handling this issue yet.
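One way to locate the offending row before editing the page is to count the cells in each row and compare them with the header row. The sketch below uses Python's standard html.parser module. It is not the importer's code, the names RowCounter and find_extra_cells are hypothetical, and it ignores colspan and rowspan attributes.

```python
from html.parser import HTMLParser


class RowCounter(HTMLParser):
    """Count the <th>/<td> cells in each <tr> of a table."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append(0)
        elif tag in ("td", "th") and self.rows:
            self.rows[-1] += 1


def find_extra_cells(table_html):
    """Return the indexes of rows with more cells than the header row."""
    counter = RowCounter()
    counter.feed(table_html)
    if not counter.rows:
        return []
    header_cells = counter.rows[0]
    return [
        index
        for index, cells in enumerate(counter.rows[1:], start=1)
        if cells > header_cells
    ]


table = (
    "<table><tr><th>Feature</th><th>Chrome</th></tr>"
    "<tr><td>Basic support</td><td>1.0</td></tr>"
    "<tr><td>Other</td><td>1.0</td><td>2.0</td></tr></table>"
)
print(find_extra_cells(table))
```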
failed_download
Error template:
Failed to download {url}.
Status {status}, Content:
{text}
No one has contributed hints for handling this issue yet.
false_start
Error template:
No <h2> found in page.
A compatibility table must be after a proper <h2> to be imported.
No one has contributed hints for handling this issue yet.
footnote_feature
Error template:
Footnotes are not allowed on features
The Feature model does not include a notes field. Remove the footnote from the feature.
No one has contributed hints for handling this issue yet.
footnote_missing
Error template:
Footnote [{footnote_id}] not found.
The compatibility table has a reference to footnote "{footnote_id}", but no matching footnote was found. This may be due to parse issues in the footnotes section, a typo in the MDN page, or a footnote that was removed without removing the footnote reference from the table.
No one has contributed hints for handling this issue yet.
footnote_multiple
Error template:
Only one footnote allowed per compatibility cell.
The API supports only one footnote per support assertion. Combine footnotes [{prev_footnote_id}] and [{footnote_id}], or remove one of them.
No one has contributed hints for handling this issue yet.
footnote_no_id
Error template:
Footnote has no ID.
Footnote references, such as [1], are used to link the footnote to the support assertion in the compatibility table. Reformat the MDN page to use footnote references.
No one has contributed hints for handling this issue yet.
footnote_unused
Error template:
Footnote [{footnote_id}] is unused.
No cells in the compatibility table included the footnote reference [{footnote_id}]. This could be due to an issue importing the compatibility cell, a typo on the MDN page, or an extra footnote that should be removed from the MDN page.
No one has contributed hints for handling this issue yet.
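The four footnote issues above (footnote_missing, footnote_multiple, footnote_no_id, and footnote_unused) all come from cross-checking the [n] references in table cells against the footnote paragraphs. The following is a rough sketch of such a cross-check, assuming footnotes always start with a numeric reference such as [1]. The function check_footnotes and its input format are hypothetical and do not reflect the importer's internals.

```python
import re

FOOTNOTE_REF = re.compile(r"\[(\d+)\]")


def check_footnotes(cells, footnotes):
    """Cross-check cell references against footnote paragraphs.

    cells: text of each compatibility cell.
    footnotes: text of each footnote paragraph.
    Returns a list of (issue_slug, detail) tuples.
    """
    issues = []
    defined = {}
    for paragraph in footnotes:
        match = FOOTNOTE_REF.match(paragraph.strip())
        if match is None:
            issues.append(("footnote_no_id", paragraph))
        else:
            defined[match.group(1)] = paragraph

    used = set()
    for cell in cells:
        refs = FOOTNOTE_REF.findall(cell)
        if len(refs) > 1:
            issues.append(("footnote_multiple", refs))
        for ref in refs:
            used.add(ref)
            if ref not in defined:
                issues.append(("footnote_missing", ref))

    for footnote_id in defined:
        if footnote_id not in used:
            issues.append(("footnote_unused", footnote_id))
    return issues


cells = ['{{CompatGeckoDesktop("1.0")}} [1]', "1.0 [2] [3]"]
footnotes = ["[1] This is a footnote", "[4] Orphan footnote", "No id here"]
for issue in check_footnotes(cells, footnotes):
    print(issue)
```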
halt_import
Error template:
Unable to finish importing MDN page.
The importer was unable to finish parsing the MDN page. This may be due to a duplicated section, or other unexpected content.
No one has contributed hints for handling this issue yet.
inline_text
Error template:
Unknown inline support text "{text}".
The API schema does not include inline notes. This text needs to be converted to a footnote, converted to a support attribute (which may require an importer update), or removed.
Possible solutions:
- A common case that causes this error is where you insert a browser version number or compatibility macro into a Browser compatibility table, but then want to include some supporting data or information about an edge case or quirk next to or just below it. The correct way to deal with this is to insert the extra information as a footnote — see the Element.querySelector Browser compat table for an example of correct usage.
nested_p
Error template:
Nested <p> tags are not supported.
Edit the MDN page to remove the nested <p> tag.
No one has contributed hints for handling this issue yet.
section_missed
Error template:
Section <h2>{title}</h2> was not imported.
The import of section {title} failed, but no parse error was detected. This is usually because of a previous critical error, which must be cleared before any parsing can be attempted.
No one has contributed hints for handling this issue yet.
section_skipped
Error template:
Section <h2>{title}</h2> has unexpected content.
The parser was trying to match rule "{rule_name}", but was unable to understand some unexpected content. This may be markup or text, or a side-effect of previous issues. Look closely at the context (as well as any previous issues) to find the problem content.
Possible solutions:
- Often, this is caused by the use of plain text or HTML instead of an expected KumaScript macro. For instance, specification tables should be using the SpecName and Spec2 macros instead of specifying text directly.
- When the proper specification tables and macros are not used, and instead a simple link to the spec is provided, follow these steps to resolve (the example I fixed when writing these steps was Timeranges.start()):
- Copy a proper spec table from a reliable source, for example the Fetch API spec table
- Paste this into the "Specifications" section of the problem page.
- Replace the specification identifying name in the SpecName(' ... ') and Spec2(' ... ') templates with the name of the spec where the feature is specified. You can look up what name to use for that particular spec in the SpecName template page. For example, Timeranges.start() is specified in the WHATWG HTML Living Standard. In the SpecName template its name is listed as "HTML WHATWG", so that's what you'll need to use.
- If the page you are fixing is for a specific API landing page, the above steps should be enough. If the page is for a specific feature like a property or method, keep reading!
- The SpecName(' ... ') template can take two more arguments. The second argument is the URL slug that, when combined with the spec's base URL, points to the exact feature in the spec. For example, the HTML WHATWG spec's URL is https://html.spec.whatwg.org/multipage/, and the URL to the Timeranges.start() method is https://html.spec.whatwg.org/multipage/embedded-content.html#dom-timeranges-start, so the second argument needs to contain 'embedded-content.html#dom-timeranges-start'.
- The third argument needs to contain a human-readable name for the feature, in this case 'start()'.
- The full template call is SpecName('HTML WHATWG', 'embedded-content.html#dom-timeranges-start', 'start()')
- The "Browser compatibility table" should be structured just like the one on the Fetch API landing page. If some different kind of table is being used, replace it with a table of this structure.
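The steps for assembling the SpecName call can be sketched as a small helper that strips the spec's base URL from the feature URL to get the second argument. The function build_specname is hypothetical. It only formats text for pasting into the page source and does not check that the specification key exists.

```python
def build_specname(spec_key, spec_base_url, feature_url, label):
    """Build a SpecName macro call from a spec key, URLs, and a label."""
    if not feature_url.startswith(spec_base_url):
        raise ValueError("feature URL is not under the spec base URL")
    # The second argument is the part of the URL after the spec's base URL.
    slug = feature_url[len(spec_base_url):]
    return "{{" + f"SpecName('{spec_key}', '{slug}', '{label}')" + "}}"


macro = build_specname(
    "HTML WHATWG",
    "https://html.spec.whatwg.org/multipage/",
    "https://html.spec.whatwg.org/multipage/embedded-content.html#dom-timeranges-start",
    "start()",
)
print(macro)
```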
spec_h2_id
Error template:
Expected <h2 id="Specifications">, actual id={h2_id}
Fix the id so that the table of contents and other features work.
No one has contributed hints for handling this issue yet.
spec_h2_name
Error template:
Expected <h2 name="Specifications">, actual name={h2_name}
Fix or remove the name attribute.
No one has contributed hints for handling this issue yet.
spec_mismatch
Error template:
SpecName({specname_key}, ...) does not match Spec2({spec2_key}).
SpecName and Spec2 must refer to the same mdn_key. Update the MDN page.
No one has contributed hints for handling this issue yet.
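To spot a mismatch before resetting the import, you can compare the first argument of SpecName with the argument of Spec2 in each specification row. The sketch below does this with regular expressions. The function check_spec_keys is hypothetical, and the real importer parses the macros with its grammar rather than with these patterns.

```python
import re

# First quoted argument of each macro (single or double quotes).
SPECNAME_KEY = re.compile(r"SpecName\(\s*(['\"])(.*?)\1")
SPEC2_KEY = re.compile(r"Spec2\(\s*(['\"])(.*?)\1")


def check_spec_keys(row):
    """Return an error message if the two macros use different keys."""
    specname = SPECNAME_KEY.search(row)
    spec2 = SPEC2_KEY.search(row)
    if specname is None or spec2 is None:
        return None  # nothing to compare in this row
    specname_key, spec2_key = specname.group(2), spec2.group(2)
    if specname_key != spec2_key:
        return f"SpecName({specname_key}, ...) does not match Spec2({spec2_key})."
    return None


row = ("<td>{{SpecName('HTML WHATWG', 'grouping-content.html#the-p-element', "
       "'<p>')}}</td> <td>{{Spec2('HTML5 W3C')}}</td>")
print(check_spec_keys(row))
```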
tag_dropped
This tends to occur when something unexpected appears in one or more of the browser compat table cells, such as a <code> element, or link. If you need to include anything unusual like a link to more information, put it in a footnote, as seen in Web Workers API.
unexpected_attribute
Error template:
Unexpected attribute {tag}
For <p>, the importer expects no attributes. This unexpected attribute will be discarded.
Possible solutions:
- This is caused by an attribute within an HTML tag that is not expected to have that attribute, e.g. when a <p> tag has an id attribute. In those cases you may just remove the attribute from the tag. If you feel the attribute is valid at that place, file a new bug against the importer and mark it as a blocker for bug 1132269.
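If many <p> tags on a page carry attributes, the cleanup can be done on a copy of the page source before pasting it back into the editor. This is a minimal sketch using a regular expression. The function strip_p_attributes is hypothetical, and it assumes that every attribute on a <p> tag can safely be dropped, so review the result before saving.

```python
import re

# Matches "<p" followed by whitespace and attributes, but not tags such
# as <pre>, because whitespace is required right after the "p".
P_WITH_ATTRS = re.compile(r"<p\s+[^>]*>", re.IGNORECASE)


def strip_p_attributes(html):
    """Replace every <p ...> opening tag with a bare <p>."""
    return P_WITH_ATTRS.sub("<p>", html)


print(strip_p_attributes('<p id="note">[1] This is a footnote</p>'))
```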
unknown_browser
Error template:
Unknown Browser "{name}".
The API does not have a browser with the name "{name}". This could be a typo on the MDN page, or the browser needs to be added to the API.
No one has contributed hints for handling this issue yet.
unknown_kumascript
Error template:
Unknown KumaScript {display} in {scope}.
The importer has to run custom code to import KumaScript, and it hasn't been taught how to import {name} when it appears in a {scope}. File a bug, or convert the MDN page to not use this KumaScript macro.
Possible solutions:
- This is caused by a macro unknown to the importer, e.g. by using geckoRelease within the compatibility hints. This is already filed as bug 1134450. If you feel your issue is not covered by that bug and the importer should be able to handle the macro, file a new bug against it and mark it as a blocker for bug 1132269. Otherwise, remove the macro.
unknown_spec
Error template:
Unknown Specification "{key}".
The API does not have a specification with mdn_key "{key}". This could be a typo on the MDN page, or the specification needs to be added to the API.
No one has contributed hints for handling this issue yet.
unknown_version
Error template:
Unknown version "{version}" for browser "{browser_name}"
The API does not have a version "{version}" for browser "{browser_name}" (id {browser_id}, slug "{browser_slug}"). This could be a typo on the MDN page, or the version needs to be added to the API.
No one has contributed hints for handling this issue yet.