Thread profiling in Python

Python has accumulated a lot of… character over the years.  We’ve got no fewer than three profiling libraries for single-threaded execution, plus a multi-threaded profiler (Yappi) with an incompatible interface.  Since many applications use more than one thread, this can be a bit annoying.

Yappi works most of the time, except that it can sometimes cause your application to hang for unknown reasons (I blame signals, personally). The other issue is that Yappi doesn’t have a way of collecting call-stack information: I don’t necessarily care that memcpy takes all of the time; I want to know who called memcpy. cProfile does record callers, and the lovely gprof2dot can take in its pstats dumps and output a very nice profile graph.
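
To make the "who called memcpy" point concrete, here is a minimal sketch using only the standard library. The function names (`copy_bytes`, `load_config`, `handle_request`) and the file name `example.pstats` are made up for illustration; `copy_bytes` simply stands in for a hot low-level routine.

```python
import cProfile
import io
import os
import pstats
import tempfile


def copy_bytes(data):
    # Hypothetical stand-in for a low-level routine such as memcpy.
    return bytes(data)


def load_config():
    return copy_bytes(b"x" * 1000)


def handle_request():
    return copy_bytes(b"y" * 1000)


def main():
    load_config()
    for _ in range(3):
        handle_request()


def profile_to_file(func, path):
    """Run func under cProfile and write a pstats dump to path."""
    prof = cProfile.Profile()
    prof.enable()
    try:
        func()
    finally:
        prof.disable()
    prof.dump_stats(path)


def callers_report(path, pattern):
    """Return the print_callers() text for functions matching pattern."""
    out = io.StringIO()
    stats = pstats.Stats(path, stream=out)
    stats.print_callers(pattern)
    return out.getvalue()


if __name__ == "__main__":
    dump_path = os.path.join(tempfile.mkdtemp(), "example.pstats")
    profile_to_file(main, dump_path)
    # Lists the functions that called copy_bytes, with call counts.
    print(callers_report(dump_path, "copy_bytes"))
    # The same dump can be rendered as a graph with gprof2dot, e.g.:
    #   gprof2dot -f pstats example.pstats | dot -Tpng -o profile.png
```

The report lists each caller of `copy_bytes` separately, which is the information a flat per-function profile cannot give you.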

To address this for my uses, I glom together cProfile runs from multiple threads. In case it might be useful for other people I wrote a quick gist illustrating how to do it. To make it easy to drop in, I monkey-patch the Thread.run method, but you can use a more maintainable approach if you like (I create a subclass ProfileThread in my applications).


from threading import Thread
 
import cProfile
import pstats
 
def enable_thread_profiling():
  '''Monkey-patch Thread.run to enable global profiling.

  Each thread creates a local profiler; statistics are pooled
  into the global stats object on run completion.'''
  Thread.stats = None
  thread_run = Thread.run
  
  def profile_run(self):
    self._prof = cProfile.Profile()
    self._prof.enable()
    thread_run(self)
    self._prof.disable()
    
    if Thread.stats is None:
      Thread.stats = pstats.Stats(self._prof)
    else:
      Thread.stats.add(self._prof)
  
  Thread.run = profile_run
  
def get_thread_stats():
  stats = getattr(Thread, 'stats', None)
  if stats is None:
    raise ValueError('Thread profiling was not enabled, '
                     'or no threads finished running.')
  return stats
 
if __name__ == '__main__':
  enable_thread_profiling()
  import time
  t = Thread(target=time.sleep, args=(1,))
  t.start()
  t.join()
  
  get_thread_stats().print_stats()
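
For the more maintainable route, a subclass along these lines could replace the monkey-patch. This is only a sketch of what a `ProfileThread` might look like, not the author's actual class: the lock, the `try`/`finally`, and the `get_stats` class method are assumptions added here so that concurrent merges and failing threads are handled. One caveat: newer interpreters (reportedly 3.12 and later) may allow only one active cProfile profiler at a time, in which case `enable()` could raise `ValueError` when profiled threads overlap, so the usage below runs its threads one after another.

```python
import cProfile
import pstats
import threading


class ProfileThread(threading.Thread):
    """Thread that profiles its own run() and pools the results."""

    _stats = None                   # shared pstats.Stats, created lazily
    _stats_lock = threading.Lock()  # guards _stats across threads

    def run(self):
        prof = cProfile.Profile()
        prof.enable()
        try:
            super().run()
        finally:
            # Merge even if the thread body raised.
            prof.disable()
            self._merge(prof)

    @staticmethod
    def _merge(prof):
        with ProfileThread._stats_lock:
            if ProfileThread._stats is None:
                ProfileThread._stats = pstats.Stats(prof)
            else:
                ProfileThread._stats.add(prof)

    @staticmethod
    def get_stats():
        with ProfileThread._stats_lock:
            if ProfileThread._stats is None:
                raise ValueError('No ProfileThread has finished running.')
            return ProfileThread._stats


def busy_work():
    # Hypothetical workload, just to give the profiler something to record.
    return sum(i * i for i in range(10000))


if __name__ == '__main__':
    for _ in range(2):
        t = ProfileThread(target=busy_work)
        t.start()
        t.join()
    ProfileThread.get_stats().sort_stats('cumulative').print_stats(5)
```

Only threads created as `ProfileThread` are profiled, so library threads stay untouched and nothing global is patched.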

Swig+Directors = Subclassing from Python!

Swig is a fabulous tool — I generally rely on it to extricate myself from the holes I’ve managed to dig myself into using C++.  Swig parses C++ code and generates wrappers for a whole bunch of target languages — I normally use it to build Python interfaces to my C++ code.

A cool feature that I’ve never made use of before is “directors”: these let you write subclasses of your C++ classes in Python (or whatever target language you desire).  In particular, this provides a relatively easy mechanism for writing callbacks in Python.  Here’s a quick example:

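// rpc.h
#include <string>

// Request and Response are assumed to be declared elsewhere.
class RPCHandler {
public:
  // Directors need a virtual destructor and virtual methods to override.
  virtual ~RPCHandler() {}
  virtual void fire(const Request&, Response*) = 0;
};

class RPC {
public:
  void register_handler(const std::string& name, RPCHandler*);
};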

Normally, I’d make a subclass of RPCHandler in C++ and register it with my RPC server. But with SWIG, I can actually write this using Python:

class MyHandler(wrap.RPCHandler):
  def fire(self, req, resp):
    resp.write('Hello world!')

It’s relatively straightforward to set up. I write an interface file describing my application:

// wrap.swig
// Our output module will be called 'wrap'; enable director support.
%module(directors="1") wrap
%feature("director") RPCHandler;

// Map std::string arguments to and from Python strings.
%include "std_string.i"

// Generate wrappers for our RPC code
%include "rpc.h"

// When compiling the wrapper code, include our original header.
%{
#include "rpc.h"
%}

That’s it! Now we can run swig:

swig -c++ -python -O -o wrap.cc wrap.swig

Swig will generate wrap.cc (which we compile and link into our application), and a wrap.py file, which we can use from Python.
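
The calling pattern that directors enable can be tried without a compiled module. The sketch below uses pure-Python stand-ins for the generated classes, so it shows only the control flow, not SWIG itself: `RPCHandler` and `RPC` here are mock-ups, and `Response.chunks` and `RPC.dispatch` are hypothetical names that do not appear in the original header. With the real wrapper, the dispatch step would happen inside the C++ server.

```python
class RPCHandler:
    """Stand-in for the SWIG-generated wrap.RPCHandler base class."""

    def fire(self, req, resp):
        raise NotImplementedError


class Response:
    """Hypothetical response object that collects written text."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)


class RPC:
    """Stand-in for the C++ RPC server."""

    def __init__(self):
        self._handlers = {}

    def register_handler(self, name, handler):
        # Storing the handler also keeps the Python object alive.
        self._handlers[name] = handler

    def dispatch(self, name, req):
        # Hypothetical: the real server would call fire() from C++.
        resp = Response()
        self._handlers[name].fire(req, resp)
        return resp


class MyHandler(RPCHandler):
    def fire(self, req, resp):
        resp.write('Hello world!')


if __name__ == '__main__':
    rpc = RPC()
    rpc.register_handler('hello', MyHandler())
    print(rpc.dispatch('hello', None).chunks)
```

With a real director it is generally wise to hold a Python reference to the handler for as long as the C++ side keeps its pointer, since the C++ code does not by itself keep the Python object alive.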