Firefox OS/Performance/Profiling

From MozillaWiki
Revision as of 02:08, 1 March 2013 by Cgj (talk | contribs)

Profiling with the gecko profiler

See these instructions. Patches are in-flight to get native stacks in profiles, but that's not in default configurations yet.

Profiling with perf

The perf utility is a performance analysis tool for Linux.

Setup

The profiling data is collected on the target device, and the report is generated on the host.
You need to install the perf tool on the host and create a directory that holds the kernel and libraries with symbols.

  • Install perf on the host (Ubuntu)
$ sudo apt-get install linux-tools
$ perf --version
perf version 3.0.17 
  • Create a directory for libraries with symbols
    A B2G makefile helper creates this directory:
$ make perf-create-symfs

Real time report

On the target device, use perf top to display performance counters in real time.

# perf top -p `pidof b2g`

The output looks like this:

  PerfTop:     388 irqs/sec  kernel:13.1%  exact:  0.0% [1000Hz cycles],  (target_pid: 7852)
-------------------------------------------------------------------------------

             samples  pcnt function                           DSO
             _______ _____ __________________________________ _________________

              403.00 31.8% _downsample_2x2_rgba8888           libGLESv2_mali.so
              119.00  9.4% JaegerStubVeneer                   libxul.so        
               93.00  7.3% _raw_spin_unlock_irqrestore        [kernel.kallsyms]
               59.00  4.7% _m200_texture_deinterleave_16x16_b libMali.so       
               56.00  4.4% memcpy                             libc.so          
               40.00  3.2% finish_task_switch                 [kernel.kallsyms]
               37.00  2.9% vfprintf                           libc.so          
               23.00  1.8% _gles_fb_tex_sub_image_2d          libGLESv2_mali.so
               16.00  1.3% __sfvwrite                         libc.so          
               16.00  1.3% __do_softirq                       [kernel.kallsyms]
               15.00  1.2% __memzero                          [kernel.kallsyms]
               13.00  1.0% getnstimeofday                     [kernel.kallsyms]
               12.00  0.9% _gles_generate_mipmaps_sw_16x16blo libGLESv2_mali.so
               12.00  0.9% snprintf                           libc.so          
               12.00  0.9% __divsi3                           libmozglue.so    
               10.00  0.8% v7_dma_clean_range                 [kernel.kallsyms]

Recording for a period and generating report

Record on the target device (press Ctrl-C to stop recording):

# perf record -o /data/local/perf.data -p `pidof b2g`

Generate the report on the host:

$ adb pull /data/local/perf.data .
$ perf report --symfs=/tmp/b2g_symfs_galaxys2 --vmlinux=/vmlinux

The output looks like this:

# Events: 4K cycles
#
# Overhead  Command      Shared Object                                                                                                 
# ........  .......  .................  ...............................................................................................
#
     8.00%      b2g  perf-7852.map      [.] 0x438413fc      
     4.46%      b2g  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
     4.36%      b2g  [unknown]          [.] 0x43843500      
     2.61%      b2g  [kernel.kallsyms]  [k] finish_task_switch
     1.69%      b2g  libxul.so          [.] JaegerStubVeneer
     1.20%      b2g  libxul.so          [.] TypedArrayTemplate<float>::obj_getElement(JSContext*, JSObject*, JSObject*, unsigned int, J
     1.06%      b2g  libxul.so          [.] void js::mjit::stubs::SetElem<0>(js::VMFrame&)
     1.05%      b2g  libxul.so          [.] js::mjit::stubs::GetElem(js::VMFrame&)
     1.01%      b2g  libc.so            [.] pthread_mutex_lock
     1.00%      b2g  libc.so            [.] memcpy
     0.90%      b2g  libxul.so          [.] JSObject::nativeLookup(JSContext*, int)
     0.88%      b2g  [kernel.kallsyms]  [k] sub_preempt_count
     0.86%      b2g  libGLESv2_mali.so  [.] 0xa3a0          
     0.82%      b2g  [kernel.kallsyms]  [k] add_preempt_count
     0.80%      b2g  [kernel.kallsyms]  [k] __do_softirq
     0.79%      b2g  libxul.so          [.] js_IsTypedArray(JSObject*)
     0.78%      b2g  libMali.so         [.] 0x13be8         
     0.67%      b2g  libxul.so          [.] js::GetPropertyHelper(JSContext*, JSObject*, int, unsigned int, JS::Value*)
     0.66%      b2g  libxul.so          [.] js::PropertyTable::search(int, bool)
     0.66%      b2g  libxul.so          [.] js_GetProperty(JSContext*, JSObject*, JSObject*, int, JS::Value*)
     0.65%      b2g  libc.so            [.] pthread_mutex_unlock
     0.59%      b2g  libxul.so          [.] castNativeFromWrapper(JSContext*, JSObject*, unsigned int, nsISupports**, JS::Value*, XPCLa
     0.57%      b2g  libmozglue.so      [.] __udivsi3
     0.53%      b2g  libxul.so          [.] mozilla::gl::GLContextEGL::MakeCurrentImpl(bool)
     0.52%      b2g  libxul.so          [.] XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode)
     0.49%      b2g  libxul.so          [.] js::TypedArray::getTypedArray(JSObject*)
     0.49%      b2g  libxul.so          [.] js::GetPropertyOperation(JSContext*, unsigned char*, JS::Value const&, JS::Value*)
     0.48%      b2g  [kernel.kallsyms]  [k] vector_swi
     0.47%      b2g  [kernel.kallsyms]  [k] get_parent_ip
     0.42%      b2g  libxul.so          [.] DisabledGetElem(js::VMFrame&, js::mjit::ic::GetElementIC*)
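
When the flat report is long, it can help to total the overhead per shared object. The following is a minimal sketch that assumes the report has been saved as text and that its columns are laid out as in the sample above (overhead, command, shared object, symbol), with a command name that contains no spaces. `sum_by_dso` is a hypothetical helper name, not part of perf.

```shell
# sum_by_dso: hypothetical helper (not part of perf).
# Reads perf report text on stdin and prints the summed overhead per
# shared object, largest first.
sum_by_dso() {
  awk '
    /^#/ || NF < 3 { next }          # skip "#" header lines and blank lines
    {
      pct = $1
      sub(/%$/, "", pct)             # "8.00%" -> "8.00"
      total[$3] += pct               # column 3 is the shared object
    }
    END {
      for (dso in total) printf "%.2f%% %s\n", total[dso], dso
    }
  ' | sort -rn
}
```

For example, redirect the perf report output into a file such as report.txt (a hypothetical name) and run `sum_by_dso < report.txt`. perf report also accepts a `--sort` option (for example `--sort dso`) that groups entries at report time, which may be preferable when the raw perf.data is still at hand.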

Recording with callgraph

Use the '-g' option to record call graphs:

# perf record -g -o /data/local/perf.data -p `pidof b2g`

Note:

  1. To get a correct call graph report, you need to compile the libraries with "-fno-omit-frame-pointer".
  2. On the SGS2 device, recording with a call graph crashes easily; this is a known issue that has not been fixed yet.

System-wide and specific application profiling

Use the '-a' option for system-wide profiling:

# perf record -o /data/local/perf.data -a

To profile a specific command, pass it to perf record:

# perf record -o /data/local/perf.data /system/b2g/b2g

Use the '-p' option to profile an existing process. (Some devices have no pidof; there you need to use ps to find the b2g PID.)

# perf record -o /data/local/perf.data -p `pidof b2g`
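
On devices without pidof, the PID can be picked out of the ps listing on the host. The sketch below assumes an Android-style ps layout in which the PID is the second column and the process name (possibly a full path such as /system/b2g/b2g) is the last column. `pid_of_name` is a hypothetical helper, not a tool shipped with B2G.

```shell
# pid_of_name: hypothetical stand-in for pidof, meant to run on the host.
# Reads `ps` output on stdin and prints the PID (column 2) of the first
# process whose last column, with any directory prefix removed, equals $1.
pid_of_name() {
  awk -v name="$1" '
    {
      n = $NF
      sub(/\r$/, "", n)      # adb shell output may end lines with CR
      sub(/.*\//, "", n)     # "/system/b2g/b2g" -> "b2g"
      if (n == name) { print $2; exit }
    }
  '
}
```

A possible use is `adb shell ps | pid_of_name b2g`, passing the printed number to the '-p' option in place of the pidof substitution.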

Makefile helpers for perf

The following B2G makefile helpers generate perf reports on the host.

  • Create a directory for libraries with symbols
$ make perf-create-symfs
  • Remove the directory for libraries with symbols
$ make perf-clean-symfs
  • Real-time system-wide perf report
$ make perf-top
  • Real-time perf report for the B2G process
$ make perf-top-b2g
  • Summary system-wide perf report
$ make perf-report
  • Summary perf report for the B2G process
$ make perf-report-b2g
  • Change the recording duration
    The perf-report-* targets record for 10 seconds and then generate the report. You can change the duration with the "RECORD_DURATION" argument.
    The example below records for 30 seconds:
$ make perf-report RECORD_DURATION=30
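
The internals of these makefile targets are not shown here. As a rough manual equivalent, perf record stops when the command it was given exits, so appending a `sleep` bounds the recording time. The sketch below only prints such a command; `timed_record_cmd` is a hypothetical helper, and the assumption that the device provides `sleep` may not hold everywhere.

```shell
# timed_record_cmd: hypothetical helper, not one of the B2G makefile targets.
# Prints (does not run) a perf record command that stops after a fixed time.
#   $1  duration in seconds (default 10, matching the helpers' default)
#   $2  target selection: "-a" for system-wide (default) or "-p PID"
timed_record_cmd() {
  duration=${1:-10}
  target=${2:--a}
  echo "perf record -o /data/local/perf.data $target -- sleep $duration"
}
```

For instance, `timed_record_cmd 30 "-p 7852"` prints a command that records PID 7852 (the PID from the sample output earlier) for 30 seconds; the resulting perf.data can then be pulled with adb and passed to perf report as described earlier.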