Firefox OS/Performance/Profiling: Difference between revisions

(→‎Profiling with perf(1): Update for the miniperf→profiling merge.)
m (Lakrits moved page FirefoxOS/Performance/Profiling to Firefox OS/Performance/Profiling: The official spelling of "Firefox OS" leaves a space between the two parts of the name. It's easier to find a page if the spelling of its name is consistent...)
 
(12 intermediate revisions by 7 users not shown)
Line 1:
= Profiling with the gecko profiler =
Good at: Native stacks (with runtime options) plus JavaScript profiling, low-overhead sampling, familiar to Gecko developers


See [https://developer.mozilla.org/en-US/docs/Performance/Profiling_with_the_Built-in_Profiler#Profiling_Boot_to_Gecko_%28with_a_real_device%29 these instructions].  Patches are in-flight to get native stacks in profiles, but that's not in default configurations yet.


= Profiling with perf(1) =
= Profiling with systrace =
 
Good at: Shows process preemption, shows all calls to instrumented functions, familiar to Android developers
Work is in progress to make the Linux kernel profiler, called "perf", useful for debugging on B2G.  See [https://bugzilla.mozilla.org/show_bug.cgi?id=831611 bug 831611] for more information; the main obstacle to landing it is obtaining stack traces, which [https://bugzilla.mozilla.org/show_bug.cgi?id=856899 bug 856899] covers in more detail.
 
== Quick Start ==
 
This should now work on both Linux and Mac build hosts.
 
# Add git://github.com/jld/B2G.git as a remote and check out the "profiling" branch from it.
# Run ./config.sh.  Don't set BRANCH here; the default is "profiling-v1", which is the v1-train manifests with suitable changes.  You can check out gecko and/or gaia to different versions afterwards.
# Put "export B2G_PROFILING=1" in .userconfig.
# Delete "out" and "objdir-gecko" (or whatever your gecko objdir is named), then ./build.sh.
# ./flash.sh
# Now profile something:
## ./run-perf.sh record-sps
## Do something of interest on the device.
## Press Enter in the shell window when the message prompts you to.
# There should have been a line like "Writing profile to perf_20130423_122912.txt".  Go to https://people.mozilla.com/~bgirard/cleopatra/ (or a local clone, if you have one) and feed it that file.
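
If the recording step is scripted, the profile file name can be picked out of the captured output. The following is a minimal sketch, assuming the output contains a line worded like the example above; the function name and the regular expression are illustrative and not part of run-perf.sh.

```python
import re

# Pattern assumed from the example message quoted above; the wording printed
# by the real script may differ between branches.
PROFILE_LINE = re.compile(r"Writing profile to (\S+\.txt)")


def find_profile_path(output):
    """Return the profile file name announced in the captured output, or None."""
    match = PROFILE_LINE.search(output)
    return match.group(1) if match else None


if __name__ == "__main__":
    captured = "Writing profile to perf_20130423_122912.txt\n"
    print(find_profile_path(captured))
```

The returned name is the file to load into Cleopatra.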
 
== Fine-Tuning ==
 
By default, perf samples based on the CPU's cycle counter, adjusting the period to gather approximately 4000 samples/sec.  However, it gathers nothing while the CPU is idle, and currently Cleopatra (the Gecko profiler front-end) ignores these times — it doesn't display them in the timeline, and its "real interval" is an average over both real inter-sample intervals and idle times.
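
To see why the reported interval can mislead, the sketch below compares an average taken over all gaps between samples with one that leaves out long idle gaps. The timestamps and the idle threshold are made-up values for illustration; this is not Cleopatra's actual code.

```python
def gaps_ms(timestamps_ms):
    """Differences between consecutive sample timestamps."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]


def mean_interval_ms(timestamps_ms):
    """Average gap between samples, with idle periods included."""
    gaps = gaps_ms(timestamps_ms)
    return sum(gaps) / len(gaps)


def busy_interval_ms(timestamps_ms, idle_threshold_ms):
    """Average gap after dropping gaps long enough to count as idle.

    The threshold is an assumption of this sketch.
    """
    busy = [g for g in gaps_ms(timestamps_ms) if g < idle_threshold_ms]
    return sum(busy) / len(busy)


if __name__ == "__main__":
    # Hypothetical trace: samples 0.25 ms apart (4000 samples/sec),
    # interrupted by a single 10 ms idle period.
    samples = [0.0, 0.25, 0.5, 0.75, 10.75, 11.0, 11.25]
    print(mean_interval_ms(samples))       # idle time inflates the average
    print(busy_interval_ms(samples, 1.0))  # close to the true sampling period
```

With even one idle period in the trace, the overall average is several times the real sampling period.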
 
Other timers are available; use the -e flag (when running "./run-perf.sh record-sps") to select one.  In particular, "-e cpu-clock" uses a real-time interval timer, which gathers samples even when the CPU is idle.  However, at least on unagi it seems to be restricted to 2500 samples/sec.
 
Note that on the miniperf branch, cpu-clock is the default; use "-e cycles" to use the cycle counter.  Or try something else; run "perf list" on a Linux host, or look at the table at the top of gonk-misc/miniperf/miniperf-record.c on the miniperf branch.  The hardware might not support some of them.
 
To set the sample rate, use the -F flag to specify a target frequency (samples/sec) or -c to give an absolute number of cycles (note that the kernel will adjust the CPU speed in response to demand).  The "cycle time" for cpu-clock appears to be in nanoseconds regardless of the physical timer used.  Note that, at very high rates (empirically, >10 kHz), the CPU may spend enough time gathering samples to noticeably slow down the application being profiled.
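
As a rough guide to choosing between -F and -c with cpu-clock, the sketch below converts between a target frequency and a period, assuming (as observed above) that the cpu-clock period is counted in nanoseconds. The helper names are illustrative only.

```python
NS_PER_SEC = 1_000_000_000


def period_ns_for_frequency(samples_per_sec):
    """Period in nanoseconds that corresponds to a target sample rate."""
    return NS_PER_SEC // samples_per_sec


def frequency_for_period_ns(period_ns):
    """Sample rate in samples/sec that corresponds to a period in nanoseconds."""
    return NS_PER_SEC / period_ns


if __name__ == "__main__":
    # The 2500 samples/sec ceiling seen on unagi is a 400000 ns period.
    print(period_ns_for_frequency(2500))
    # The 10 kHz rate at which overhead becomes noticeable is a 100000 ns period.
    print(period_ns_for_frequency(10000))
```

Under that assumption, a -c value of 400000 with cpu-clock would request about the same rate as -F 2500.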
 
== In More Detail ==


The original run-perf.sh commands were "record" and "report", corresponding to those perf(1) commands, except that "record" is run on the device (and the perf.data file pulled afterwards, along with kallsyms) and "report" constructs a symlink farm to provide symbol information.  This may be useful to those who are already familiar with perf, but it's not the most obvious interface for new users.
Bad at: Requires configure option, higher overhead


However: "report" works only on Linux hosts (and not even all of those out of the box, without rebuilding the perf executable, due to library issues), and while "record" will work without "report" it's not very useful — the perf.data format is undocumented, and the code for reading and writing it is not trivial to extract from its Linux dependencies.
*Download the Android SDK to get the systrace tool:
**1. [http://developer.android.com/sdk/index.html Download link]
**2. The systrace.py tool is at path-to-android-sdk/tools/systrace


As an alternative to this, "./run-perf.sh minirecord" accepts a subset of the perf record options (-e -c -F -m -o, and defaults to -a -g -e cpu-clock) and obtains output in a simpler format.  The actual command it runs, "miniperf-record", is built in gonk-misc with the normal Android makefiles.
*Enable systrace in B2G:
**Build with the '--enable-systrace' configure option, or just uncomment the MOZ_USE_SYSTRACE define in gecko/tools/profiler/GeckoProfilerImpl.h, like this:
<pre>
#define MOZ_USE_SYSTRACE
#ifdef MOZ_USE_SYSTRACE
# define ATRACE_TAG ATRACE_TAG_ALWAYS
// We need HAVE_ANDROID_OS to be defined for Trace.h.
// If it's not set, we set it temporarily and remove it afterwards.
# ifndef HAVE_ANDROID_OS
#  define HAVE_ANDROID_OS
#  define REMOVE_HAVE_ANDROID_OS
# endif
</pre>


The next layer is "./run-perf.sh sps", which converts the raw profile data to the format used by the Gecko profiler (with symbols). If it was collected with "./run-perf.sh record" then it needs to run perf report, in which case see above about nonportability; for miniperf it reads the file with Python code that should run everywhere.
*How to use systrace:
**[http://developer.android.com/tools/help/systrace.html systrace.py documentation]
**./systrace.py --time=10 -o mynewtrace.html sched


Finally, "./run-perf.sh record-sps" is just "minirecord" followed by "sps".
Note: Gecko code is tagged as ATRACE_TAG_ALWAYS, so we don't set the category type.

Latest revision as of 13:59, 1 February 2015

Profiling with the gecko profiler

Good at: Native stacks (with runtime options) plus JavaScript profiling, low-overhead sampling, familiar to Gecko developers

See these instructions. Patches are in-flight to get native stacks in profiles, but that's not in default configurations yet.

Profiling with systrace

Good at: Shows process preemption, shows all calls to instrumented functions, familiar to Android developers

Bad at: Requires configure option, higher overhead

  • Download the Android SDK to get the systrace tool:
    • 1. Download link
    • 2. The systrace.py tool is at path-to-android-sdk/tools/systrace
  • Enable systrace in B2G:
    • Build with the '--enable-systrace' configure option, or just uncomment the MOZ_USE_SYSTRACE define in gecko/tools/profiler/GeckoProfilerImpl.h, like this:
#define MOZ_USE_SYSTRACE
#ifdef MOZ_USE_SYSTRACE
# define ATRACE_TAG ATRACE_TAG_ALWAYS
// We need HAVE_ANDROID_OS to be defined for Trace.h.
// If it's not set, we set it temporarily and remove it afterwards.
# ifndef HAVE_ANDROID_OS
#   define HAVE_ANDROID_OS
#   define REMOVE_HAVE_ANDROID_OS
# endif

Note: Gecko code is tagged as ATRACE_TAG_ALWAYS, so we don't set the category type.