Release Management/WER Investigation

This page tracks the initial investigation of Windows Error Reporting (WER). Our current plan is to hook our bug tracking and metrics into WinQual, the site/API used to access Microsoft's crash info. This data should help us diagnose hangs (which aren't currently tracked) on Windows XP and later, and may even uncover new crashes.

Meetings

Associated Bugs

Resources

Builds

Creating File Mappings

  • MS suggests using their Microsoft Product Feedback Mapping Tool (AppMap.exe) to create a file manifest, which is required for crash collection for a new product version
  • (?) Need to explore generating this file manifest so that the build machines can create it

Uploading File Mappings

  • MS suggests using the "Upload File Mappings" option on the Administration menu
  • (?) Need to figure out if the WER web API can instead be used to perform an upload

WinQual Update Frequency

  • (?) Determine if there are any limitations to the number of products that can be registered (for possible use with nightlies)
    • Need to take into consideration the >2 day lag time on getting reports
  • (?) Bandwidth limits for pulling down
  • Only the first few cab files are stored for a crash. The event viewer web interface offers the ability to make a "Data Request" to collect "Processor & Memory Information", the heap, specific files (logs, etc.), and additional cab files.
    • (?) Can the web API be used to expose this to developers or will we need more accounts?
  • (?) Need to find out what it means when no cab files are available and the web interface offers "(click here to switch to collection mode)" - aren't we always collecting?

Accessing WinQual Data

Windows Live Login

  • (?) Access to WinQual requires a Windows Live login. Similarly, MS's StackHash client requires the Windows Live Sign-in Assistant. Need to determine if the associated library is required for logging in.
  • (?) Need to figure out what account we'd use for automated tools

WER Web API

Breakpad/Socorro

  • (?) Need to understand the overlap between WinQual crash data and Breakpad crash data, map the applicable info, and decide what to do with "additional" info
    • (?) Can we make use of individual hit event info in general? (as opposed to just crash cabs)
  • (?) Will hangs (heap dumps) need to be handled any differently than minidumps?
  • (?) Who do we give access to minidumps?
  • (?) What is our current data retention policy?
    • We may want to keep hangs around for longer since there may be a lot, and they've never been investigated
  • (?) What is our access audit ability?

Cab File Contents (for collector/processor)

  • WERInternalMetadata.xml - (possibly) not present if version.txt is. Includes
    • OSVersionInformation - windows version info, architecture, etc.
    • ProblemSignatures - event type (crash/hang), crashing executable name, exe version/timestamp, methodDef token of faulting method (?), and IL offset of faulting instruction (?)
    • DynamicSignatures -
    • SystemInformation - HW info. What's an MID?
  • AppCompat.txt (also seen in all lowercase) - not present if WERDataCollectionFailure.txt is. Includes information on all images loaded by the process.
  • WERDataCollectionFailure.txt - includes an error message if processing failed on Microsoft's side.
  • version.txt - only came across this once. Only included OS version.
  • For crashes
  • For hangs
    • <process-name>.xml - additional hang metadata like the wait chain list
    • memory.hdmp - info about the difference between a heapdump and a minidump outlined here
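
As a rough illustration of how a collector/processor might use the file list above, here is a minimal Python sketch. It assumes the cab has already been extracted by some other tool, so it works only from a list of file names and the text of WERInternalMetadata.xml. The function names, the sample XML, and every element name below the four section names are hypothetical and would need checking against real cab files. Treating memory.hdmp as the marker for a hang is likewise an assumption drawn from the notes above.

```python
import xml.etree.ElementTree as ET

# Section names taken from the notes above.
SECTIONS = (
    "OSVersionInformation",
    "ProblemSignatures",
    "DynamicSignatures",
    "SystemInformation",
)

# Hypothetical sample: the root element and child element names are
# illustrative assumptions, not copied from a real report.
SAMPLE_METADATA = """
<WERReportMetadata>
  <OSVersionInformation>
    <WindowsNTVersion>6.1</WindowsNTVersion>
    <Architecture>X64</Architecture>
  </OSVersionInformation>
  <ProblemSignatures>
    <EventType>AppHangB1</EventType>
    <Parameter0>firefox.exe</Parameter0>
  </ProblemSignatures>
  <SystemInformation>
    <MID>00000000-0000-0000-0000-000000000000</MID>
  </SystemInformation>
</WERReportMetadata>
"""


def classify_cab(filenames):
    """Summarise which of the known files are present in an extracted cab.

    Names are compared case-insensitively because AppCompat.txt has been
    seen in both mixed case and all lowercase.
    """
    names = {name.lower() for name in filenames}
    has_metadata = "werinternalmetadata.xml" in names
    return {
        "has_metadata": has_metadata,
        "has_appcompat": "appcompat.txt" in names,
        "collection_failed": "werdatacollectionfailure.txt" in names,
        "version_only": "version.txt" in names and not has_metadata,
        # Assumption: a heap dump named memory.hdmp indicates a hang report.
        "is_hang": "memory.hdmp" in names,
    }


def parse_metadata(xml_text):
    """Return {section name: {child tag: text}} for the known sections.

    A section that is missing from the document maps to an empty dict.
    """
    root = ET.fromstring(xml_text)
    result = {}
    for section in SECTIONS:
        node = root.find(section)
        if node is None:
            result[section] = {}
        else:
            result[section] = {
                child.tag: (child.text or "").strip() for child in node
            }
    return result


if __name__ == "__main__":
    files = ["WERInternalMetadata.xml", "appcompat.txt",
             "memory.hdmp", "firefox.exe.xml"]
    print(classify_cab(files))
    print(parse_metadata(SAMPLE_METADATA)["ProblemSignatures"])
```

Keeping the file-presence check separate from the XML parsing means a cab that only contains version.txt or WERDataCollectionFailure.txt can still be recorded without attempting to parse metadata that is not there.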

Future Investigations

  • Consider providing a solution (link) when the user is presented with the Windows crash dialog
    • Can even link to an exe (if part of the "Designed for Windows" logo program), which may be a good last-ditch option if even Firefox's safe mode fails (e.g. the application files are no longer pristine and a reinstall is needed).