Cypherpunks (talk | contribs) (Added a project for sanitizer)
Line 90:
Add support for interactive form objects, which allow users to fill out input fields (text, checkboxes, radio buttons, etc.) in a PDF and then "submit" them. We need to find out whether more PDFs with forms use XFA or AcroForms. Rumor has it that most tax forms are XFA, which is the biggest use case driving this so far. Once we figure out whether XFA or AcroForms is more important to implement, then we need to research whether these controls map 1:1 to HTML5 form controls. If so, implementing forms would approximately mean overlaying the PDF display with the appropriate HTML5 elements. We also need to research how these controls can be styled; that might be difficult to implement. Lastly, forms have some support for running JavaScript, which we may not want to allow. It would be an interesting problem to see if such script could be sandboxed properly.
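As a starting point for the 1:1 mapping question, here is a rough sketch for the AcroForms case only. The field type names (<code>Tx</code>, <code>Btn</code>, <code>Ch</code>, <code>Sig</code>) and the flag bit positions come from the PDF specification; the function <code>htmlControlFor</code>, the <code>HtmlControl</code> shape, and the choice of HTML elements are assumptions made for illustration, not a researched result. XFA is not covered.

```typescript
// Field flag bits from the PDF specification (bit n corresponds to 1 << (n - 1)).
const FLAG_MULTILINE = 1 << 12;  // text fields, bit 13
const FLAG_PASSWORD = 1 << 13;   // text fields, bit 14
const FLAG_RADIO = 1 << 15;      // button fields, bit 16
const FLAG_PUSHBUTTON = 1 << 16; // button fields, bit 17
const FLAG_COMBO = 1 << 17;      // choice fields, bit 18

// Values of the /FT entry of an AcroForm field dictionary.
type FieldType = "Tx" | "Btn" | "Ch" | "Sig";

// Hypothetical description of the HTML element to overlay on the page.
interface HtmlControl {
  tag: "input" | "textarea" | "select" | "button";
  inputType?: string; // only meaningful when tag is "input"
  listBox?: boolean;  // only meaningful when tag is "select"
}

// Guess an HTML5 control for a field; returns null when nothing fits.
function htmlControlFor(fieldType: FieldType, flags: number): HtmlControl | null {
  switch (fieldType) {
    case "Tx":
      if (flags & FLAG_MULTILINE) return { tag: "textarea" };
      return { tag: "input", inputType: flags & FLAG_PASSWORD ? "password" : "text" };
    case "Btn":
      if (flags & FLAG_PUSHBUTTON) return { tag: "button" };
      return { tag: "input", inputType: flags & FLAG_RADIO ? "radio" : "checkbox" };
    case "Ch":
      // Without the Combo flag the field is a list box rather than a drop-down.
      return { tag: "select", listBox: !(flags & FLAG_COMBO) };
    default:
      // Signature fields have no obvious HTML5 counterpart.
      return null;
  }
}
```

Even in this small sketch the mapping is not strictly 1:1: signature fields have no counterpart, and styling is not addressed at all.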
==== Big project: Sanitizer ====
PDF files can contain malicious content that may be used to compromise the security of the system. Rendering a PDF file with PDF.js is often slow or broken, which leads users to open the file in a native reader instead. It would be very useful to have a means of removing malicious content from a PDF.
* Use PDF.js to parse the PDF into an internal representation, but do not render it.
* Decompress and destream it.
* Remove all potentially malicious tags (this should be configurable in a popup window similar to "Clear Recent History"): JavaScript, Flash, 3D, forms, signatures, remote content, and anything else not needed for rendering.
* Recreate the PDF file from the internal representation, recomputing every field that can be recomputed, to mitigate memory corruption exploits.
Firefox should offer to sanitize a PDF whenever the user downloads one by any means (either through the PDF.js GUI or the standard Firefox download dialog).
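The removal and recomputation steps listed above could look roughly like the following sketch. Everything here is hypothetical: <code>PdfDict</code>, <code>PdfStream</code>, <code>KEYS_BY_CATEGORY</code> and <code>sanitizeDocument</code> are invented for illustration and are not PDF.js's actual internal types or API. The key lists are illustrative and far from complete, stream data is assumed to be already decompressed, and writing the result back out as a PDF file is left out.

```typescript
// Hypothetical, simplified object model. PDF.js's real internal types differ.
type PdfValue = string | number | boolean | null | PdfValue[] | PdfDict | PdfStream;

interface PdfDict {
  kind: "dict";
  entries: { [key: string]: PdfValue };
}

interface PdfStream {
  kind: "stream";
  dict: PdfDict;
  data: Uint8Array; // assumed to be already decompressed
}

// Categories the user could toggle, as in "Clear Recent History".
type Category = "javascript" | "forms" | "embeddedFiles";

// Illustrative (incomplete) dictionary keys to drop for each category.
const KEYS_BY_CATEGORY: Record<Category, string[]> = {
  javascript: ["JS", "JavaScript", "OpenAction", "AA"],
  forms: ["AcroForm"],
  embeddedFiles: ["EmbeddedFiles"],
};

// Collect the banned keys for the categories the user enabled.
function bannedKeys(enabled: Category[]): string[] {
  const keys: string[] = [];
  for (const category of enabled) {
    for (const key of KEYS_BY_CATEGORY[category]) keys.push(key);
  }
  return keys;
}

// Copy a dictionary, skipping banned keys and sanitizing nested values.
function sanitizeDict(dict: PdfDict, banned: string[]): PdfDict {
  const entries: { [key: string]: PdfValue } = {};
  for (const key of Object.keys(dict.entries)) {
    if (banned.indexOf(key) !== -1) continue;
    entries[key] = sanitizeValue(dict.entries[key], banned);
  }
  return { kind: "dict", entries };
}

// Recursively sanitize any value of the object tree.
function sanitizeValue(value: PdfValue, banned: string[]): PdfValue {
  if (Array.isArray(value)) {
    return value.map((item) => sanitizeValue(item, banned));
  }
  if (value !== null && typeof value === "object") {
    if (value.kind === "stream") {
      const dict = sanitizeDict(value.dict, banned);
      // Recompute Length from the actual data instead of trusting the file.
      dict.entries["Length"] = value.data.length;
      return { kind: "stream", dict, data: value.data };
    }
    return sanitizeDict(value, banned);
  }
  return value; // strings, numbers, booleans and null pass through unchanged
}

// Entry point: sanitize a document starting from its catalog dictionary.
function sanitizeDocument(catalog: PdfDict, enabled: Category[]): PdfDict {
  return sanitizeDict(catalog, bannedKeys(enabled));
}
```

The sketch builds a fresh tree rather than editing the parsed one in place, so nothing from the original file reaches the output unless it was explicitly copied. A real sanitizer would more likely use an allowlist of keys needed for rendering than a blocklist like this one.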
=== Testing ===