Firefox OS/Performance/Profiling: Difference between revisions

From MozillaWiki
No edit summary
m (Lakrits moved page FirefoxOS/Performance/Profiling to Firefox OS/Performance/Profiling: The official spelling of "Firefox OS" leaves a space between the two parts of the name. It's easier to find a page if the spelling of its name is consistent...)
 
(14 intermediate revisions by 8 users not shown)
Line 1: Line 1:
= Profiling with the gecko profiler =
Good at: Native stacks (with runtime options) plus JavaScript profiling, low-overhead sampling, familiar to Gecko developers


See [https://developer.mozilla.org/en-US/docs/Performance/Profiling_with_the_Built-in_Profiler#Profiling_Boot_to_Gecko_%28with_a_real_device%29 these instructions].  Patches are in-flight to get native stacks in profiles, but that's not in default configurations yet.


= Profiling with perf =
= Profiling with systrace =
The perf utility is a performance analysis tool for Linux.
Good at: Shows process preemption, shows all calls to instrumented functions, familiar to Android developers
 
== Setup ==
Profiling data is collected on the target device, and the report is generated on the host.<br>
Install the perf tool on the host, and create a directory holding the kernel and libraries with symbols.
 
* Install perf on the host (Ubuntu)
$ sudo apt-get install linux-tools
$ perf --version
perf version 3.0.17
 
* Create a directory for libraries with symbols<br>Here's a B2G makefile helper to create this directory.
$ make perf-create-symfs
 
== Real time report ==
On the target device, use perf top to sample and display performance counters in real time.
# perf top -p `pidof b2g`
The output will be like this:
  PerfTop:    388 irqs/sec  kernel:13.1%  exact:  0.0% [1000Hz cycles],  (target_pid: 7852)
-------------------------------------------------------------------------------
             samples  pcnt function                           DSO
             _______ _____ __________________________________ _________________
              403.00 31.8% _downsample_2x2_rgba8888           libGLESv2_mali.so
              119.00  9.4% JaegerStubVeneer                   libxul.so
               93.00  7.3% _raw_spin_unlock_irqrestore        [kernel.kallsyms]
               59.00  4.7% _m200_texture_deinterleave_16x16_b libMali.so
               56.00  4.4% memcpy                             libc.so
               40.00  3.2% finish_task_switch                 [kernel.kallsyms]
               37.00  2.9% vfprintf                           libc.so
               23.00  1.8% _gles_fb_tex_sub_image_2d          libGLESv2_mali.so
               16.00  1.3% __sfvwrite                         libc.so
               16.00  1.3% __do_softirq                       [kernel.kallsyms]
               15.00  1.2% __memzero                          [kernel.kallsyms]
               13.00  1.0% getnstimeofday                     [kernel.kallsyms]
               12.00  0.9% _gles_generate_mipmaps_sw_16x16blo libGLESv2_mali.so
               12.00  0.9% snprintf                           libc.so
               12.00  0.9% __divsi3                           libmozglue.so
               10.00  0.8% v7_dma_clean_range                 [kernel.kallsyms]
 
== Recording for a period and generating report ==
Record on the target (press Ctrl-C to stop recording):
# perf record -o /data/local/perf.data -p `pidof b2g`
 
Generate the report on the host:
$ adb pull /data/local/perf.data .
$ perf report --symfs=/tmp/b2g_symfs_galaxys2 --vmlinux=/vmlinux
The output will be like this:
# Events: 4K cycles
#
# Overhead  Command      Shared Object                                                                                               
# ........  .......  .................  ...............................................................................................
#
      8.00%      b2g  perf-7852.map      [.] 0x438413fc     
      4.46%      b2g  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
      4.36%      b2g  [unknown]          [.] 0x43843500     
      2.61%      b2g  [kernel.kallsyms]  [k] finish_task_switch
      1.69%      b2g  libxul.so          [.] JaegerStubVeneer
      1.20%      b2g  libxul.so          [.] TypedArrayTemplate<float>::obj_getElement(JSContext*, JSObject*, JSObject*, unsigned int, J
      1.06%      b2g  libxul.so          [.] void js::mjit::stubs::SetElem<0>(js::VMFrame&)
      1.05%      b2g  libxul.so          [.] js::mjit::stubs::GetElem(js::VMFrame&)
      1.01%      b2g  libc.so            [.] pthread_mutex_lock
      1.00%      b2g  libc.so            [.] memcpy
      0.90%      b2g  libxul.so          [.] JSObject::nativeLookup(JSContext*, int)
      0.88%      b2g  [kernel.kallsyms]  [k] sub_preempt_count
      0.86%      b2g  libGLESv2_mali.so  [.] 0xa3a0         
      0.82%      b2g  [kernel.kallsyms]  [k] add_preempt_count
      0.80%      b2g  [kernel.kallsyms]  [k] __do_softirq
      0.79%      b2g  libxul.so          [.] js_IsTypedArray(JSObject*)
      0.78%      b2g  libMali.so         [.] 0x13be8
      0.67%      b2g  libxul.so          [.] js::GetPropertyHelper(JSContext*, JSObject*, int, unsigned int, JS::Value*)
      0.66%      b2g  libxul.so          [.] js::PropertyTable::search(int, bool)
      0.66%      b2g  libxul.so          [.] js_GetProperty(JSContext*, JSObject*, JSObject*, int, JS::Value*)
      0.65%      b2g  libc.so            [.] pthread_mutex_unlock
      0.59%      b2g  libxul.so          [.] castNativeFromWrapper(JSContext*, JSObject*, unsigned int, nsISupports**, JS::Value*, XPCLa
      0.57%      b2g  libmozglue.so      [.] __udivsi3
      0.53%      b2g  libxul.so          [.] mozilla::gl::GLContextEGL::MakeCurrentImpl(bool)
      0.52%      b2g  libxul.so          [.] XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode)
      0.49%      b2g  libxul.so          [.] js::TypedArray::getTypedArray(JSObject*)
      0.49%      b2g  libxul.so          [.] js::GetPropertyOperation(JSContext*, unsigned char*, JS::Value const&, JS::Value*)
      0.48%      b2g  [kernel.kallsyms]  [k] vector_swi
      0.47%      b2g  [kernel.kallsyms]  [k] get_parent_ip
      0.42%      b2g  libxul.so          [.] DisabledGetElem(js::VMFrame&, js::mjit::ic::GetElementIC*)
 
== Recording with callgraph ==
 
Use option '-g' to do callgraph recording:
# perf record -g -o /data/local/perf.data -p `pidof b2g`
 
Note:
# To get a correct call graph report, compile the libraries with "-fno-omit-frame-pointer".
# On the SGS2 device, recording with call graphs crashes easily; this issue has yet to be fixed.
 
== System-wide and specific application profiling ==
 
Use option '-a' to do system-wide profiling:
# perf record -o /data/local/perf.data -a


Profile a specific command:
Bad at: Requires configure option, higher overhead
# perf record -o /data/local/perf.data /system/b2g/b2g


Use option '-p' to profile an existing process (some devices have no pidof; use ps to find the b2g PID instead):
*Download the Android SDK to get the systrace tool:
# perf record  -o /data/local/perf.data -p `pidof b2g`
**[http://developer.android.com/sdk/index.html 1. download link]
**2. the systrace.py tool is at path-to-android-sdk/tools/systrace
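
When pidof is missing, the b2g PID for the perf '-p' option can be taken from the ps listing instead. The sketch below shows one way to do that from the host. find_pid is a hypothetical helper name, and it assumes the process name is the last column and the PID the second column of the ps output, as in the Android toolbox ps. Check the layout on your device before relying on it.

```shell
# Hypothetical helper (not part of B2G): print the PID of the first process
# whose name, taken from the last column of ps output on stdin, equals $1.
# Assumes the PID is the second column; adjust if your ps differs.
find_pid() {
    # adb shell may emit CRLF line endings, so strip carriage returns first.
    tr -d '\r' | awk -v name="$1" '$NF == name { print $2; exit }'
}

# Usage from the host (assumes adb is on PATH):
#   B2G_PID=$(adb shell ps | find_pid /system/b2g/b2g)
#   adb shell perf record -o /data/local/perf.data -p "$B2G_PID"
```

The helper reads from stdin so that it can be tried against a saved ps listing without a device attached.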


== Makefile helpers for perf ==
*Enable systrace in B2G:
**Build with the '--enable-systrace' configure option, or uncomment the MOZ_USE_SYSTRACE define in gecko/tools/profiler/GeckoProfilerImpl.h, like this:
<pre>
#define MOZ_USE_SYSTRACE
#ifdef MOZ_USE_SYSTRACE
# define ATRACE_TAG ATRACE_TAG_ALWAYS
// We need HAVE_ANDROID_OS to be defined for Trace.h.
// If its not set we will set it temporary and remove it.
# ifndef HAVE_ANDROID_OS
#  define HAVE_ANDROID_OS
#  define REMOVE_HAVE_ANDROID_OS
# endif
</pre>


Here are B2G makefile helpers to generate perf reports on the host.
*How to use systrace:
**[http://developer.android.com/tools/help/systrace.html systrace.py document]
**./systrace.py --time=10 -o mynewtrace.html sched


* Create a directory for libraries with symbols
Note: Gecko code is tagged as ATRACE_TAG_ALWAYS, so we don't set the category type.
$ make perf-create-symfs
* Remove the directory for libraries with symbols
$ make perf-clean-symfs
* Real-time perf report, system-wide
$ make perf-top
* Real-time report for the B2G process
$ make perf-top-b2g
* Summary perf report, system-wide
$ make perf-report
* Summary perf report for the B2G process
$ make perf-report-b2g
* Change recording duration<br>The perf-report-* targets automatically record for 10 seconds and then generate the report. Change the duration with the "RECORD_DURATION" argument.<br>The example below records for 30 seconds:
$ make perf-report RECORD_DURATION=30
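
A fixed-duration recording can also be made by hand. A common perf idiom is to give perf record a sleep command as its workload, so that recording stops when sleep exits. The sketch below only builds that command line. perf_record_cmd is a hypothetical helper, this is not necessarily how the makefile targets work, and it assumes the perf build and shell on the device provide sleep.

```shell
# Hypothetical helper: print a system-wide perf record command that stops
# after $1 seconds (default 10) by profiling while `sleep` runs.
perf_record_cmd() {
    duration="${1:-10}"
    printf 'perf record -o /data/local/perf.data -a sleep %s\n' "$duration"
}

# Usage from the host (assumes adb is on PATH):
#   adb shell "$(perf_record_cmd 30)"
#   adb pull /data/local/perf.data .
```

Building the command as a string keeps the duration in one place and lets the same line be pasted into a device shell directly.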

Latest revision as of 13:59, 1 February 2015

Profiling with the gecko profiler

Good at: Native stacks (with runtime options) plus JavaScript profiling, low-overhead sampling, familiar to Gecko developers

See these instructions. Patches are in-flight to get native stacks in profiles, but that's not in default configurations yet.

Profiling with systrace

Good at: Shows process preemption, shows all calls to instrumented functions, familiar to Android developers

Bad at: Requires configure option, higher overhead

  • Download the Android SDK to get the systrace tool:
    • 1. download link
    • 2. the systrace.py tool is at path-to-android-sdk/tools/systrace
  • Enable systrace in B2G:
    • Build with the '--enable-systrace' configure option, or uncomment the MOZ_USE_SYSTRACE define in gecko/tools/profiler/GeckoProfilerImpl.h, like this:
#define MOZ_USE_SYSTRACE
#ifdef MOZ_USE_SYSTRACE
# define ATRACE_TAG ATRACE_TAG_ALWAYS
// We need HAVE_ANDROID_OS to be defined for Trace.h.
// If its not set we will set it temporary and remove it.
# ifndef HAVE_ANDROID_OS
#   define HAVE_ANDROID_OS
#   define REMOVE_HAVE_ANDROID_OS
# endif

Note: Gecko code is tagged as ATRACE_TAG_ALWAYS, so we don't set the category type.