Exception leaks in Python 2 and 3

Recently I decided to port a little package that I had to Python 3, and ran into the traceback reference cycle problem.  This blog is the result of the detective work I had to do, both to re-familiarize myself with this issue (I haven’t been doing this sort of stuff for a few years) and to uncover the exact behaviour in Python 3.

Background

In Python 2, exceptions are stored internally as three separate objects: The type, the value and the traceback objects. The value is normally an instance of the type by the time your python code runs, so mostly we are dealing with value and traceback only. There are two pitfalls one should be aware of when writing exception handling code.

The traceback cycle problem

Normally, you don’t worry about the traceback object. You write code like:

def foo():
    try:
        return bar()
    except Exception as e:
        print "got this here", e

The trouble starts when you want to do something with the traceback. This could be to log it, or maybe translate the exception to something else:

def foo():
    try:
        return bar()
    except Exception as e:
        type, val, tb = sys.exc_info()
        print "got this here", e, repr(tb)

The problem is that the traceback, stored in tb, holds a reference to the execution frame of foo, which again holds the definition of tb. This is a cyclic reference and it means that both the traceback, and all the frames it contains, won’t disappear immediately.

This is the “traceback reference cycle problem” and it should be familiar to serious Python developers. It is a problem because a traceback contains a link to all the frames from the point of catching the exception to where it occurred, along with all temporary variables. The cyclic garbage collector will eventually reclaim it (if enabled) but that occurs unpredictably, and at some later point. The ensuing memory wastage may be problematic, the latency involved when gc finally runs, or it may cause problems with unittests that rely on reference counts to detect when objects die. Ideally, things should just go away when not needed anymore.

The same problem occurs whenever the traceback is present in a frame where an exception is raised, or caught. For example, this pattern here will also cause the problem in the called function translate() because tb is present in the frame where it is raised.

def translate(tp, val, tb):
    # translate this into a different exception and re-raise
    raise MyException(str(val)), None, tb

In python 2, the standard solution is to either avoid retrieving the traceback object if possible, e.g. by using

tp, val = sys.exc_info()[:2]

or by explicitly clearing it yourself and thus removing the cycle:

def translate(tp, val, tb):
    # translate this into a different exception and re-raise
    try:
        raise MyException(str(val)), None, tb
    finally:
        del tb

By vigorous use of try-finallythe prudent programmer avoids leaving references to traceback objects on the stack.

The lingering exception problem

A related problem is the lingering exception problem. It occurs when exceptions are caught and handled in a function that then does not exit, for example a driving loop:

def mainloop():
    while True:
        try:
            do_work()
        except Exception as e:
            report_error(e)

As innocent as this code may look, it suffers from a problem: The most recently caught exception stays alive in the system. This includes its traceback, even though it is no longer used in the code. Even clearing the variable won’t help:

report_error(e)
e = None

This is because of the following clause from the Python documentation:

If no expressions are present, raise re-raises the last exception that was active in the current scope.

In Python 2, the exception is kept alive internally, even after the try-except construct has been exited, as long as you don’t return from the function.

The standard solution to this, in Python 2, is to use the exc_clear() function from the sys module:

def mainloop():
    while True:
        try:
            do_work()
        except Exception as e:
            report_error(e)
        sys.exc_clear() # clear the internal traceback

The prudent programmer liberally sprinkles sys.exc_clear() into his mainloop.

Python 3

In Python 3, two things have happened that change things a bit.

  1. The traceback has been rolled into the exception object
  2. sys.exc_clear() has been removed.

Let’s look at the implications in turn.

Exception.__traceback__

While it unquestionably makes sense to bundle the traceback with the exception instance as an attribute, it means that traceback reference cycles can become much more common. No longer is it sufficient to refrain from examining sys.exc_info(). Whenever you store an exception object in a variable local to a frame that is part of its traceback, you get a cycle. This includes both the function where the exception is raised, and where it is caught.

Code like this is suspect:

def catch():    
    try:
        result = bar()
    except Exception as e:
        result = e
    return result

The variable result is part of the frame that is referenced result.__traceback__ and a cycle has been created.

(Note that the variable e is not problematic. In Python 3, this variable is automatically cleared when the except clause is exited.)

similarly:

def reraise(tp, value, tb=None):
    if value is None:
        value = tp()
    if value.__traceback__ is not tb:
        raise value.with_traceback(tb)
    raise value

(The above code is taken from the six module)

Both of these cases can be handled with a well placed try-finally to clear the variables result, value and tb respectively:

def catch():    
    try:
        result = bar()
    except Exception as e:
        result = e
    try:
        return result
    finally:
        del result
def reraise(tp, value, tb=None):
    if value is None:
        value = tp()
    try:
        if value.__traceback__ is not tb:
            raise value.with_traceback(tb)
        raise value
    finally:
        del value, tb
Note that the caller of reraise() also has to clear his locals that he used as an argument, because the same exception is being re-raised and the caller's frame will get added to the exception:
try:
    reraise(*exctuple):
finally:
    del exctuple

The lesson learned from this is the following:

Don’t store exceptions in local objects for longer than necessary. Always clear such variables when leaving the function using try-finally.

sys.exc_clear()

This method has been removed in Python 3. This is because it is no longer needed:

def mainloop():
    while True:
        try:
            do_work()
        except Exception as e:
            report_error(e)
        assert sys.exc_info() == (None, None, None)

As long as sys.exc_info() was empty when the function was called, i.e. it was not called as part of exception handling, then the inner exception state is clear outside the Except clause.

However, if you want to hang on to an exception for some time, and are worried about reference cycles or memory usage, you have two options:

  1. clear the exception’s __traceback__ attribute:
    e.__traceback__ = None
  2. use the new traceback.clear_frames() function
    traceback.clear_frames(e.__traceback__)

clear_frames() was added to remove local variables from tracebacks in order to reduce their memory footprint. As a side effect, it will clear the reference cycles.

Conclusion

Exception reference cycles are a nuisance when developing robust Python applications. Python 3 has added some extra pitfalls. Even though the local variable of the except clause is automatically cleared, the user must himself clear any other variables that might contain exception objects.

 

 

7 thoughts on “Exception leaks in Python 2 and 3

  1. I don’t understand why you would do this.

    def catch():    
        try:
            result = bar()
        except Exception as e:
            result = e
        return result

    Typically one would return a value OR raise an exception.

    Apologies if that was formatted poorly. There’s no preview button here to see how to do code tags.

    • The point of this artificial code was to demonstrate that returning an exception can also be problematic.
      Sometimes you are marshalling calls across some thread boundary or other and you want to break the exception chain.
      Then you might have something like the following:

      def perform_work(work):
        try:
          result = True, work()
        except Exception as e:
          result = False, e
        return result
      

      and elsewhere:

      ok, result = perform_work_remotely(work)
      if ok:
        return result
      else:
        raise result

      In both of these places, it would be important to clear ‘result’ before leaving the frame.

  2. excellent post, thanks a lot I had an issue with throwing a lot of exceptions in a huge loop causing memory crash

  3. Here’s how I got to know this behiavour:

    def raise_appropriate_exception(some_val):
        if some_val == 23:
            ex = SomeException()
        else:
            ex = SomeOtherException()
    
        raise ex
    

    Thank you for this article!

  4. Watch out also that “e.__traceback__ = None” has no effect if it is followed by a bare raise. Because then __traceback__ will be added back to the original exception object. Proof:

    “`
    cache = {}

    try:
    try:
    1/0
    except Exception as ex:
    ex.__traceback__ = None
    print(ex, ex.__traceback__)
    cache[1] = ex
    raise
    except Exception:
    pass

    print(cache[1], cache[1].__traceback__)
    “`

    To work around this, you cannot store the original exception object in the cache, you have to 1. create a copy, 2. set “__traceback__ = None” on the copy, and 3. store the copy in the cache, not the original exception.

Leave a comment