Profiling Cython extensions
Support for Python's cProfile was recently added to Cython (see http://hg.cython.org/cython-devel/rev/181618626844). This makes it possible to profile code across the Python/C boundary.
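As a rough sketch of what this looks like from the Python side, the snippet below runs a function under cProfile and prints the report. The function `sum_of_squares` and the helper `profile_call` are hypothetical names used only for illustration; `sum_of_squares` is a pure-Python stand-in for a function that would normally live in a Cython module built with profiling enabled (for example with the `# cython: profile=True` directive at the top of the .pyx file).

```python
import cProfile
import io
import pstats


def sum_of_squares(n):
    # Pure-Python stand-in (hypothetical) for a function that would be
    # defined in a Cython module compiled with profiling support.
    total = 0.0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile; return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and show at most the ten costliest entries.
    stats.strip_dirs().sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


result, report = profile_call(sum_of_squares, 1000)
print(report)
```

With a profiling-enabled extension, the Cython functions should show up in the report next to the Python ones.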
Timings (2.33GHz Intel Core 2 Duo)
>>> from profile_profile import *
>>> time time_cdef_prof(10**8)
CPU times: user 1.00 s, sys: 0.00 s, total: 1.00 s
Wall time: 1.01 s
3.3333332833346318e+23
>>> time time_cdef_no_prof(10**8)
CPU times: user 0.38 s, sys: 0.00 s, total: 0.38 s
Wall time: 0.39 s
3.3333332833346318e+23
>>> time time_def_prof(10**7)
CPU times: user 2.07 s, sys: 0.01 s, total: 2.08 s
Wall time: 2.08 s
49999995000000.0
>>> time time_def_no_prof(10**7)
CPU times: user 2.01 s, sys: 0.01 s, total: 2.02 s
Wall time: 2.03 s
49999995000000.0
>>> (1.01 - 0.39) / 10**8
6.2000000000000001e-09
>>> (2.08 - 2.03) / 10**7
5.0000000000000266e-09
In each case, with profiling support compiled in but not used, the overhead was roughly 5 to 6 nanoseconds per call, or about a dozen clock cycles at 2.33GHz. If the function bodies were at all non-trivial, one probably wouldn't notice it at all.
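The per-call figures can be rechecked with a few lines of Python. This is only a sketch of the arithmetic: the wall times, call counts, and clock speed are taken from the timings above, while the helper name `per_call_overhead_ns` is made up for this example.

```python
CLOCK_HZ = 2.33e9  # 2.33GHz Intel Core 2 Duo, as stated above


def per_call_overhead_ns(wall_prof, wall_no_prof, calls):
    """Extra wall time per call, in nanoseconds, caused by profiling support."""
    return (wall_prof - wall_no_prof) / calls * 1e9


# cdef functions: 10**8 calls, 1.01 s with support vs 0.39 s without
cdef_ns = per_call_overhead_ns(1.01, 0.39, 10**8)
# def functions: 10**7 calls, 2.08 s with support vs 2.03 s without
def_ns = per_call_overhead_ns(2.08, 2.03, 10**7)

# Convert nanoseconds to clock cycles at the stated clock speed.
cdef_cycles = cdef_ns * 1e-9 * CLOCK_HZ
def_cycles = def_ns * 1e-9 * CLOCK_HZ

print("cdef: %.1f ns per call, about %.0f cycles" % (cdef_ns, cdef_cycles))
print("def:  %.1f ns per call, about %.0f cycles" % (def_ns, def_cycles))
```

Both results land in the range of a dozen or so cycles per call, consistent with the estimate in the text.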
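For profiling at the C level, the script can be run under valgrind's callgrind tool and the resulting log inspected with kcachegrind:

valgrind --tool=callgrind -v --dump-instr=yes --trace-jump=yes --callgrind-out-file=callgrind.log python myscript.py
kcachegrind callgrind.log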
OSX itself provides a very good profiling tool called Instruments.app. Valgrind now also runs under OSX, and kcachegrind can be installed via `MacPorts <http://macports.org>`_.
A quick way to get profiling info on OSX is to:

1. Start your long-running process.
2. Go to the Activity Monitor, select the process, and click "Sample Process".
3. In the window that pops up, select "Display: Percent of Parent".