Performance/MemShrink/DMD: Difference between revisions

From MozillaWiki
DMD (short for "dark matter detector") is a tool that tracks which heap blocks have been reported by memory reporters.  It helps us reduce the "heap-unclassified" value in Firefox's about:memory page.  It also detects if any heap blocks are reported twice.
The contents of this page have moved to [https://developer.mozilla.org/en-US/docs/Mozilla/Performance/DMD MDN].
 
== Building and Running ==
 
=== Desktop Firefox (Linux) ===
 
==== Build ====
 
Build with these options:
 
  ac_add_options --enable-dmd
 
If building via try server, modify
<code>browser/config/mozconfigs/linux64/common-opt</code> or a similar file
before pushing.
 
==== Launch ====
 
From a Bourne-style shell, do this:
 
  # DMD=1 selects the default options; replace the '1' with one or more DMD options (see below)
  LD_PRELOAD=$OBJDIR/dist/lib/libdmd.so \
  LD_LIBRARY_PATH=$OBJDIR/dist/lib/ \
  DMD=1 \
  <command>
 
If you want to run under gdb, do this:
 
  # DMD=1 selects the default options; replace the '1' with one or more DMD options (see below)
  LD_PRELOAD=$OBJDIR/dist/lib/libdmd.so \
  LD_LIBRARY_PATH=$OBJDIR/dist/lib/ \
  DMD=1 \
  gdb --args <command>
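
If you launch DMD builds often, the environment setup above can be wrapped in a small shell function. The following is only a sketch: the name <tt>run_with_dmd</tt> and its argument order are made up for illustration and are not part of DMD.

```shell
# Hypothetical helper (not part of DMD): run a command with DMD preloaded.
# $1 = object directory, $2 = value for the DMD variable, rest = command.
run_with_dmd() {
    dmd_objdir=$1
    dmd_value=$2
    shift 2
    # The assignments apply only to the environment of the launched command.
    LD_PRELOAD="$dmd_objdir/dist/lib/libdmd.so" \
    LD_LIBRARY_PATH="$dmd_objdir/dist/lib/" \
    DMD="$dmd_value" \
    "$@"
}
```

With this in place, <tt>run_with_dmd "$OBJDIR" 1 <command></tt> should be equivalent to the first invocation shown above. Quoting the second argument keeps a whitespace-separated option list together as one value.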
 
==== Trigger ====
 
Visit about:memory and click the DMD button (depending on how old your build is, it might be labelled "Save" or "Analyze reports" or "DMD"). The button won't be present in non-DMD builds, and will be grayed out in DMD builds if DMD isn't enabled at start-up.
 
This triggers all the memory reporters and then DMD analyzes the reports, printing this commentary:
 
  DMD[11420] opened /tmp/dmd-1409885041-22021.txt.gz for writing
  DMD[11420] AnalyzeReports 1 {
  DMD[11420]  gathering stack trace records...
  DMD[11420]  creating and sorting twice-reported stack trace record array...
  DMD[11420]  creating and sorting unreported stack trace record array...
  DMD[11420]  printing unreported stack trace record array...
  DMD[11420]  creating and sorting once-reported stack trace record array...
  DMD[11420]  printing once-reported stack trace record array...
  DMD[11420] }
 
In an e10s-enabled build, you'll see separate output for each process.
 
If you see the "opened" line, it tells you where the file was saved. If you're on an older build and don't see that, it'll be saved in a file in your current working directory with a <code>.dmd</code> suffix.
 
==== Post-process ====
 
Unzip the file (with <tt>gunzip</tt>) if necessary, and then post-process the output file using <tt>tools/rb/fix_linux_stack.py</tt>, which reads from <tt>stdin</tt> and writes to <tt>stdout</tt>. This step is slow, and can take 2 minutes or more.
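
The unzip and post-process steps can be combined into one pipeline. The function below is a sketch under stated assumptions: <tt>dmd_fix_stacks</tt> is a hypothetical name, and the only thing it relies on is that the fixer script reads from <tt>stdin</tt> and writes to <tt>stdout</tt>, as described above.

```shell
# Hypothetical helper: feed a DMD log (gzipped or not) through a filter.
# $1 = log file, rest = the filter command and its arguments.
dmd_fix_stacks() {
    dmd_log=$1
    shift
    # Decompress only when the file name ends in .gz.
    case $dmd_log in
        *.gz) gunzip -c "$dmd_log" ;;
        *)    cat "$dmd_log" ;;
    esac | "$@"
}
```

For example, <tt>dmd_fix_stacks /tmp/dmd-1409885041-22021.txt.gz python tools/rb/fix_linux_stack.py > dmd-fixed.txt</tt> would write the post-processed log to <tt>dmd-fixed.txt</tt>. The interpreter name <tt>python</tt> and the output file name are assumptions; adjust them for your setup.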
 
=== Desktop Firefox (Mac) ===
 
==== Build ====
 
Build with these options:
 
  ac_add_options --enable-debug
  ac_add_options --enable-dmd
 
''Note: non-debug DMD builds do not currently work on Mac. See {{bug|995443}}.''
 
If building via try server, modify
<code>browser/config/mozconfigs/macosx64/common-opt</code> or a similar file
before pushing.
 
==== Launch ====
 
Start the browser like this:
 
  # DMD=1 selects the default options; replace the '1' with one or more DMD options (see below)
  DYLD_INSERT_LIBRARIES=$OBJDIR/dist/lib/libdmd.dylib \
  LD_LIBRARY_PATH=$OBJDIR/dist/lib/ \
  DMD=1 \
  <command>
 
If you want to run under lldb, do this:
 
  # DMD=1 selects the default options; replace the '1' with one or more DMD options (see below)
  DYLD_INSERT_LIBRARIES=$OBJDIR/dist/lib/libdmd.dylib \
  LD_LIBRARY_PATH=$OBJDIR/dist/lib/ \
  DMD=1 \
  lldb -- <command>
 
==== Trigger ====
 
Follow the [[#Trigger|Trigger instructions for Linux]]. Note that on Mac this step can
take 30+ seconds.
 
==== Post-process ====
 
Post-process the output file using <tt>tools/rb/fix_macosx_stack.py</tt>, which
reads from <tt>stdin</tt> and writes to <tt>stdout</tt>. This step is slow, and
can take 30 seconds or more.
 
If your build does not contain symbols, for example because you did not compile with <tt>--enable-profiling</tt>, this step may not produce usable stacks. However, if you have crash reporter symbols for your build (try server builds do), you can use [https://github.com/mstange/analyze-tryserver-profiles/blob/master/resymbolicate_dmd.py this script] instead: clone the whole repository, edit the paths at the top of <tt>resymbolicate_dmd.py</tt>, and run it.
 
=== Desktop Firefox (Windows) ===
 
==== Build ====
 
Build with these options:
 
  ac_add_options --enable-dmd
  ac_add_options --enable-profiling
 
If building via try server, modify
<code>browser/config/mozconfigs/win32/common-opt</code>. Also, add this line to
<tt>build/mozconfig.common</tt>:
 
  MOZ_CRASHREPORTER_UPLOAD_FULL_SYMBOLS=1
 
==== Launch ====
 
On a local build, start the browser like this:
 
  rem Set DMD to 1 for the default options, or to one or more DMD options (see below)
  set MOZ_REPLACE_MALLOC_LIB=path\to\dmd.dll
  set DMD=1
  <command>
 
On a build done by the try server, follow [https://bugzilla.mozilla.org/show_bug.cgi?id=936784#c69 these instructions] instead.
 
==== Trigger ====
 
Follow the [[#Trigger|Trigger instructions for Linux]].
 
=== B2G ===
 
==== Build ====
 
'''First, update your B2G checkout''' with <tt>git pull</tt> or <tt>git fetch && git merge origin/master</tt>.  <tt>./repo sync</tt> is not sufficient!  You must git pull to get the latest version of the relevant tools.
 
For B2G device builds, we don't usually modify the mozconfig (although you can; it's hiding under gonk-misc/default-gecko-config). Instead, modify your .userconfig and add
 
  export MOZ_DMD=1
 
You probably need to clobber your objdir (rm -rf objdir-gecko).  Then build normally.
 
==== Launch ====
 
Your build will then automatically launch with DMD enabled.  (The <tt>b2g.sh</tt> script figures out whether to enable DMD by checking for the presence of <tt>libdmd.so</tt> in <tt>/system/b2g</tt>.) You'll see a message in logcat when a process starts up:
 
  I/DMD    (  305): $DMD = '1'
  I/DMD    (  305): DMD is enabled
 
The <tt>run-gdb.sh</tt> script also knows to start DMD builds with DMD enabled, so you don't need to do anything special.
 
If you want to run B2G on a device with non-default options, you'll need to modify the value of <tt>DMD</tt> in the <tt>gonk-misc/b2g.sh</tt> script and then push it to the device like so:
 
  adb shell stop b2g
  adb remount
  adb push b2g.sh /system/bin
  adb shell chmod 0755 /system/bin/b2g.sh
  adb shell start b2g
 
If you want to run B2G on the device under GDB with non-default options, modify the run-gdb.sh script. You don't need to push anything.
 
==== Trigger ====
 
When you want to analyze the contents of memory, run <tt>tools/get_about_memory.py</tt>.  You should see output like the following:
 
    $ ./get_about_memory.py
    Got 3/3 files.
    Pulled files into about-memory-18.
    Got 3 DMD dump(s).
    [...]
    Done processing DMD files.  Have a look in about-memory-18.
 
'''Note:''' If you are not using the default <tt>GECKO_OBJDIR</tt>, you'll need to pass it to <tt>get_about_memory.py</tt>:

    $ ./get_about_memory.py --gecko-objdir $OBJDIR
 
<tt>get_about_memory.py</tt> invokes <tt>fix_b2g_stack.py</tt> to post-process stack traces, so you shouldn't need to run it yourself, but it's there in case you need it.
 
See <tt>get_about_memory.py --help</tt> for more options, but you probably won't need anything other than the defaults.
 
=== B2G Desktop Client (Linux) ===
 
==== Build ====
 
[https://developer.mozilla.org/en-US/Firefox_OS/Building_the_B2G_desktop_client Build the B2G Desktop Client], adding these options to the
[https://developer.mozilla.org/en/docs/Configuring_Build_Options mozconfig] file:
 
  ac_add_options --enable-debug
  ac_add_options --enable-dmd
 
==== Launch ====
 
Start b2g like this:
 
  # DMD=1 selects the default options; replace the '1' with one or more DMD options (see below)
  LD_PRELOAD=$OBJDIR/dist/lib/libdmd.so \
  LD_LIBRARY_PATH=$OBJDIR/dist/lib/ \
  DMD=1 \
  $OBJDIR/dist/bin/b2g -profile $GAIA/profile-debug
 
==== Trigger ====
 
[https://wiki.mozilla.org/FirefoxOS/Performance/Debugging_OOMs#Step_2.2C_option_4:_Run_B2G_on_your_desktop Send signal 34 to the b2g process]:
 
  $ killall -34 b2g
 
The gzipped DMD log is written to the <tt>/tmp</tt> directory, e.g.
<code>/tmp/dmd-1406557150-14443.txt.gz</code>.
 
==== Post-process ====
 
Follow the [[#Post-process|Post-process instructions for desktop (Linux)]].
 
=== Fennec ===
 
'''Due to [https://bugzilla.mozilla.org/show_bug.cgi?id=823354 bug 823354], you may get empty stack traces on Fennec, rendering DMD's output largely unhelpful.  Sorry.'''
 
==== Build ====
 
Build with these options:
 
  ac_add_options --enable-dmd
 
==== Launch ====
 
Launch with the following commands (be sure to replace "org.mozilla.fennec" with the app identifier as appropriate; this will usually be org.mozilla.fennec_$USERNAME for a local build).
 
  adb push $OBJDIR/dist/lib/libdmd.so /sdcard/
  adb shell am start -n org.mozilla.fennec/.App \
    --es env0 MOZ_REPLACE_MALLOC_LIB=/sdcard/libdmd.so \
    --es env1 DMD=1  # or replace the '1' with one or more DMD options (see below)
 
The commentary on Fennec goes to logcat, and looks like this:
 
  I/DMD  (27314): $DMD = '1'
  I/DMD  (27314): DMD is enabled
 
The number in the parentheses is the process ID.
 
==== Trigger ====
 
Use the existing memory-report dumping hook:
 
  adb shell am broadcast -a org.mozilla.gecko.MEMORY_DUMP
 
In logcat, you should see output similar to this:
 
  E/GeckoConsole (27314): nsIMemoryInfoDumper dumped reports to /data/data/org.mozilla.fennec_kats/app_tmp/memory-report-default-27314.json.gz
 
The directory in that path (it should always be <tt>/data/data/$APPID/app_tmp/</tt>) is where the memory reports and DMD reports are written. You can pull them like so:
 
  adb pull /data/data/org.mozilla.fennec_kats/app_tmp/memory-report-default-27314.json.gz
  adb pull /data/data/org.mozilla.fennec_kats/app_tmp/dmd-default-27314.txt.gz
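
Both file names follow from the app identifier and the process ID shown in logcat, so they can be generated rather than typed out. This is a sketch only: <tt>fennec_dump_paths</tt> is a hypothetical helper, and it assumes the <tt>default</tt> component of the file names is the same as in the example above.

```shell
# Hypothetical helper: print the memory-report and DMD dump paths
# for a given app identifier ($1) and process ID ($2).
fennec_dump_paths() {
    app_id=$1
    pid=$2
    dump_dir="/data/data/$app_id/app_tmp"
    # "default" is taken from the example file names; it is an assumption
    # that other builds use the same identifier.
    printf '%s\n' \
        "$dump_dir/memory-report-default-$pid.json.gz" \
        "$dump_dir/dmd-default-$pid.txt.gz"
}
```

One way to use it is <tt>fennec_dump_paths org.mozilla.fennec_kats 27314 | while read -r path; do adb pull "$path"; done</tt>.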
 
== Interpreting the output ==
 
DMD's output is broken into multiple sections.
 
# "Invocation".  This tells you how DMD was invoked, i.e. what options were used.
# "Twice-reported stack trace records".  This tells you which heap blocks were reported twice or more.  The presence of any such records indicates bugs in one or more memory reporters.
# "Unreported stack trace records".  This tells you which heap blocks were not reported, which indicates where additional memory reporters would be most helpful.
# "Once-reported stack trace records": like the "Unreported stack trace records" section, but for blocks reported once.
# "Summary": gives measurements of the total heap, and the unreported/once-reported/twice-reported portions of it.
# "Execution measurements": gives some statistics about DMD's execution, which are mostly of interest to DMD's developers.
 
The "Twice-reported stack trace records" and "Unreported stack trace records" sections are the most important, because they indicate ways in which the memory reporters can be improved.
 
Here's an example stack trace record from the "Unreported stack trace records" section.
 
Unreported: 3 blocks in stack trace record 209 of 1,891
  36,864 bytes (26,184 requested / 10,680 slop)
  0.03% of the heap (64.55% cumulative);  0.04% of unreported (86.78% cumulative)
  Allocated at
    malloc (/home/njn/moz/mi2/memory/build/replace_malloc.c:151) 0x417170
    PR_Malloc (/home/njn/moz/mi2/nsprpub/pr/src/malloc/prmem.c:435) 0x7f68650f423c
    PL_ArenaAllocate (/home/njn/moz/mi2/nsprpub/lib/ds/plarena.c:200) 0x7f68652463e1
    nsFixedSizeAllocator::Alloc(unsigned long) (/home/njn/moz/mi2/xpcom/ds/nsFixedSizeAllocator.cpp:95) 0x7f6860f528dc
    nsNodeInfo::Create(nsIAtom*, nsIAtom*, int, unsigned short, nsIAtom*, nsNodeInfoManager*) (/home/njn/moz/mi2/content/base/src/nsNodeInfo.cpp:64) 0x7f685f640933
    nsNodeInfoManager::GetNodeInfo(nsIAtom*, nsIAtom*, int, unsigned short, nsIAtom*) (/home/njn/moz/mi2/content/base/src/nsNodeInfoManager.cpp:225) 0x7f685f642d05
    mozilla::dom::Element::SetAttrAndNotify(int, nsIAtom*, nsIAtom*, nsAttrValue const&, nsAttrValue&, unsigned char, bool, bool, bool) (/home/njn/moz/mi2/content/base/src/Element.cpp:1862) 0x7f685f60ad87
    mozilla::dom::Element::SetAttr(int, nsIAtom*, nsIAtom*, nsAString_internal const&, bool) (/home/njn/moz/mi2/content/base/src/Element.cpp:1778) 0x7f685f60a9b3
    nsXMLContentSink::AddAttributes(unsigned short const**, nsIContent*) (/home/njn/moz/mi2/content/xml/document/src/nsXMLContentSink.cpp:1464) 0x7f685fa76c5c
    nsXBLContentSink::AddAttributes(unsigned short const**, nsIContent*) (/home/njn/moz/mi2/content/xbl/src/nsXBLContentSink.cpp:882) 0x7f685fb3ad42
    nsXMLContentSink::HandleStartElement(unsigned short const*, unsigned short const**, unsigned int, int, unsigned int, bool) (/home/njn/moz/mi2/content/xml/document/src/nsXMLContentSink.cpp:1018) 0x7f685fa73db5
    nsXMLContentSink::HandleStartElement(unsigned short const*, unsigned short const**, unsigned int, int, unsigned int) (/home/njn/moz/mi2/content/xml/document/src/nsXMLContentSink.cpp:947) 0x7f685fa7370a
    nsXBLContentSink::HandleStartElement(unsigned short const*, unsigned short const**, unsigned int, int, unsigned int) (/home/njn/moz/mi2/content/xbl/src/nsXBLContentSink.cpp:258) 0x7f685fb37cc0
 
It tells you that there were 3 heap blocks that were allocated from the program point indicated by the "Allocated at" stack trace, that these blocks took up 36,864 bytes, and that 10,680 of those bytes were "slop" (wasted space caused by the heap allocator rounding up request sizes).  It also indicates what percentage of the total heap size and the unreported portion of the heap these blocks represent.
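
The byte counts in the example record can be checked with shell arithmetic. The sketch below uses only the figures from that record; the variable names are illustrative.

```shell
# Figures copied from the example record above.
requested=26184
slop=10680

# The total size of the blocks is the requested bytes plus the slop.
total=$((requested + slop))
echo "$total bytes in total"

# Slop as a share of the total, in tenths of a percent, rounded to nearest.
slop_tenths=$(( (slop * 1000 + total / 2) / total ))
echo "slop is $((slop_tenths / 10)).$((slop_tenths % 10))% of the total"
```

So more than a quarter of the space taken up by these three blocks is rounding overhead from the heap allocator rather than memory the caller asked for.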
 
Within each section, records are listed from largest to smallest.
 
Once-reported and twice-reported stack trace records also have stack traces for the report point(s).  For example:
 
Reported at
  mozilla::dmd::Report(void const*) (/home/njn/moz/mi2/memory/replace/dmd/DMD.cpp:1740) 0x7f68652581ca
  CycleCollectorMallocSizeOf(void const*) (/home/njn/moz/mi2/xpcom/base/nsCycleCollector.cpp:3008) 0x7f6860fdfe02
  nsPurpleBuffer::SizeOfExcludingThis(unsigned long (*)(void const*)) const (/home/njn/moz/mi2/xpcom/base/nsCycleCollector.cpp:933) 0x7f6860fdb7af
  nsCycleCollector::SizeOfIncludingThis(unsigned long (*)(void const*), unsigned long*, unsigned long*, unsigned long*, unsigned long*, unsigned long*) const (/home/njn/moz/mi2/xpcom/base/nsCycleCollector.cpp:3029) 0x7f6860fdb6b1
  CycleCollectorMultiReporter::CollectReports(nsIMemoryMultiReporterCallback*, nsISupports*) (/home/njn/moz/mi2/xpcom/base/nsCycleCollector.cpp:3075) 0x7f6860fde432
  nsMemoryInfoDumper::DumpMemoryReportsToFileImpl(nsAString_internal const&) (/home/njn/moz/mi2/xpcom/base/nsMemoryInfoDumper.cpp:626) 0x7f6860fece79
  nsMemoryInfoDumper::DumpMemoryReportsToFile(nsAString_internal const&, bool, bool) (/home/njn/moz/mi2/xpcom/base/nsMemoryInfoDumper.cpp:344) 0x7f6860febaf9
  mozilla::(anonymous namespace)::DumpMemoryReportsRunnable::Run() (/home/njn/moz/mi2/xpcom/base/nsMemoryInfoDumper.cpp:58) 0x7f6860fefe03
 
You can tell which memory reporter made the report by the name of the MallocSizeOf function near the top of the stack trace.  In this case it was the cycle collector's reporter.
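
To get a rough overview of which reporters appear in a post-processed log, you can count the function names that end in <tt>MallocSizeOf</tt>. This is a sketch, not a DMD feature: <tt>dmd_reporters</tt> is a hypothetical helper, it assumes every reporter's function name ends in <tt>MallocSizeOf</tt> as in the example above, and it relies on <tt>grep -o</tt>, which GNU and BSD grep provide.

```shell
# Hypothetical helper: count occurrences of *MallocSizeOf function names.
# Reads the named files, or stdin when no file is given.
dmd_reporters() {
    grep -oE '[A-Za-z0-9_]+MallocSizeOf' "$@" | sort | uniq -c | sort -rn
}
```

Each output line holds a count followed by a function name, with the most frequent name first. The counts are occurrences in stack traces, not bytes, so treat them only as a pointer to the records worth reading.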
 
By default, DMD measures heap blocks above a certain size precisely, but uses sampling to measure blocks below that size.  Any measurements that involve sampled blocks (even if combined with non-sampled measurements) are approximate, and this is indicated by a preceding '~'.  For example:
 
Unreported: ~273 blocks in block group 17 of 14,611
  ~1,125,590 bytes (~1,117,936 requested / ~7,654 slop)
  0.07% of the heap (2.58% cumulative);  0.43% of unreported (16.36% cumulative)
 
The sampling threshold can be adjusted with an option (see below).  This will affect the precision of the output and the speed at which Firefox+DMD runs.
 
== Options ==
 
Setting the <tt>DMD</tt> environment variable to <tt>1</tt> gives default options.  But you can also specify non-default options by setting <tt>DMD</tt> to a whitespace-separated list of <tt>--option=val</tt> entries.
 
==== --sample-below=<1..n> ====
 
By default, DMD uses a sample-below size of 4093.  That is, it samples allocations smaller than this size instead of recording each one, which lets it run (much) faster.
 
If you want DMD to record all allocations precisely, pass <tt>--sample-below=1</tt>.  Otherwise, you should probably leave it unchanged.  If you do pick a different value, prime numbers work best.
 
==== --max-frames=<1..24> ====
 
By default, DMD stack traces do not exceed 24 frames. You can reduce this.
 
==== --max-records=<1..1000000> ====
 
By default, DMD will print 1000 stack trace records of each kind. You can
increase or decrease this.
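
The three options above can be combined into a single value for <tt>DMD</tt>. The function below is a sketch that checks the documented ranges before building the string; <tt>dmd_opts</tt> is a hypothetical name and the range checks simply mirror the headings above.

```shell
# Hypothetical helper: build a DMD option string from
# $1 = sample-below, $2 = max-frames, $3 = max-records.
# Returns non-zero if a value is outside the documented range.
dmd_opts() {
    sample_below=$1
    max_frames=$2
    max_records=$3
    [ "$sample_below" -ge 1 ] || return 1
    [ "$max_frames" -ge 1 ] && [ "$max_frames" -le 24 ] || return 1
    [ "$max_records" -ge 1 ] && [ "$max_records" -le 1000000 ] || return 1
    printf '%s\n' "--sample-below=$sample_below --max-frames=$max_frames --max-records=$max_records"
}
```

For example, <tt>DMD="$(dmd_opts 1 12 500)"</tt> sets <tt>DMD</tt> to <tt>--sample-below=1 --max-frames=12 --max-records=500</tt>. The double quotes matter, because the entries are separated by whitespace.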
 
==== --mode=<normal|test|stress> ====
 
<tt>--mode=<normal|test|stress></tt> can be used to invoke "test" or "stress" mode, which are useful if you're hacking on DMD.  The default is normal mode.
 
"test" and "stress" modes set their own <tt>--sample-below</tt> values, so you
should never have to specify both <tt>--sample-below</tt> and <tt>--mode</tt>.
 
To run the tests, specify <tt>--mode=test</tt> and start Firefox. It will print some output and exit very quickly. Then run the following command from the top of your source directory.
 
  memory/replace/dmd/check_test_output.py . test.dmd
 
(If you invoked Firefox from a different directory to your source directory, you might need to specify the path to <tt>test.dmd</tt>.)
 
This script checks the output produced by the previous step, and will indicate if the test passed or failed. It should work on Linux and Mac, but is unreliable on Windows.
 
== Which heap blocks are reported? ==
 
At this stage you might wonder how DMD knows which allocations have been reported and which haven't.  DMD only knows about heap blocks that are measured via a function created with one of the following two macros:
 
  MOZ_DEFINE_MALLOC_SIZE_OF
  MOZ_DEFINE_MALLOC_SIZE_OF_ON_ALLOC
 
Fortunately, most of the existing memory reporters do this.  See [[Platform/Memory_Reporting]] for more details about how memory reporters are written.
 
== Troubleshooting ==
 
Contact Nick Nethercote ("njn" on IRC) or Nathan Froyd ("froydnj" on IRC).

Latest revision as of 05:10, 21 October 2014
