Recently I decided to port a little package that I had to Python 3, and ran into the traceback reference cycle problem. This blog is the result of the detective work I had to do, both to re-familiarize myself with this issue (I haven’t been doing this sort of stuff for a few years) and to uncover the exact behaviour in Python 3.
Background
In Python 2, exceptions are stored internally as three separate objects: the type, the value and the traceback. The value is normally an instance of the type by the time your Python code runs, so mostly we are dealing with the value and the traceback only. There are two pitfalls one should be aware of when writing exception handling code.
The traceback cycle problem
Normally, you don’t worry about the traceback object. You write code like:
    def foo():
        try:
            return bar()
        except Exception as e:
            print "got this here", e
The trouble starts when you want to do something with the traceback. This could be to log it, or maybe translate the exception to something else:
    import sys

    def foo():
        try:
            return bar()
        except Exception as e:
            # tp rather than type, to avoid shadowing the builtin
            tp, val, tb = sys.exc_info()
            print "got this here", e, repr(tb)
The problem is that the traceback, stored in tb, holds a reference to the execution frame of foo, which in turn holds the local variable tb. This is a cyclic reference and it means that neither the traceback nor any of the frames it contains will disappear immediately.
This is the “traceback reference cycle problem” and it should be familiar to serious Python developers. It is a problem because a traceback contains a link to all the frames from the point of catching the exception to where it occurred, along with all temporary variables. The cyclic garbage collector will eventually reclaim it (if enabled) but that occurs unpredictably, and at some later point. The resulting memory waste may be problematic, as may the latency when the collector finally runs, and it can break unit tests that rely on reference counts to detect when objects die. Ideally, things should just go away when not needed anymore.
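The cycle can be observed directly. The following is only a sketch: it is written in Python 3 syntax, it assumes CPython's reference counting, and the names Marker, leaky, clean and demo are made up for illustration. A weak reference to a local object shows whether the frame was released when the function returned.

```python
import gc
import sys
import weakref


class Marker:
    """Stand-in for any local object we would like to see released."""


def bar():
    raise ValueError("boom")


def leaky():
    marker = Marker()
    probe = weakref.ref(marker)
    try:
        bar()
    except Exception:
        tb = sys.exc_info()[2]  # tb -> this frame -> tb: a cycle
    return probe


def clean():
    marker = Marker()
    probe = weakref.ref(marker)
    try:
        bar()
    except Exception:
        tb = sys.exc_info()[2]
        del tb  # break the cycle before leaving the frame
    return probe


def demo():
    """Return (alive_after_leaky, alive_after_gc, alive_after_clean)."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the collector from hiding the effect
    try:
        probe = leaky()
        alive_after_leaky = probe() is not None
        gc.collect()
        alive_after_gc = probe() is not None
        alive_after_clean = clean()() is not None
        return alive_after_leaky, alive_after_gc, alive_after_clean
    finally:
        if was_enabled:
            gc.enable()
```

Under those assumptions, the marker from leaky() survives until the collector runs, while the one from clean() is gone as soon as the function returns.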
The same problem occurs whenever the traceback is present in a frame where an exception is raised, or caught. For example, this pattern here will also cause the problem in the called function translate() because tb is present in the frame where it is raised.
    def translate(tp, val, tb):
        # translate this into a different exception and re-raise
        raise MyException(str(val)), None, tb
In Python 2, the standard solution is either to avoid retrieving the traceback object if possible, e.g. by using
tp, val = sys.exc_info()[:2]
or to clear it explicitly yourself, thus removing the cycle:
    def translate(tp, val, tb):
        # translate this into a different exception and re-raise
        try:
            raise MyException(str(val)), None, tb
        finally:
            del tb
By vigorous use of try-finally, the prudent programmer avoids leaving references to traceback objects on the stack.
The lingering exception problem
A related problem is the lingering exception problem. It occurs when exceptions are caught and handled in a function that then does not exit, for example a driving loop:
    def mainloop():
        while True:
            try:
                do_work()
            except Exception as e:
                report_error(e)
As innocent as this code may look, it suffers from a problem: The most recently caught exception stays alive in the system. This includes its traceback, even though it is no longer used in the code. Even clearing the variable won’t help:
    report_error(e)
    e = None
This is because of the following clause from the Python documentation:
If no expressions are present, raise re-raises the last exception that was active in the current scope.
In Python 2, the exception is kept alive internally, even after the try-except construct has been exited, as long as you don’t return from the function.
The standard solution to this, in Python 2, is to use the exc_clear() function from the sys module:
    def mainloop():
        while True:
            try:
                do_work()
            except Exception as e:
                report_error(e)
            sys.exc_clear()  # clear the internal traceback
The prudent programmer liberally sprinkles sys.exc_clear() into his mainloop.
Python 3
In Python 3, two things have happened that change things a bit.
- The traceback has been rolled into the exception object
- sys.exc_clear() has been removed.
Let’s look at the implications in turn.
Exception.__traceback__
While it unquestionably makes sense to bundle the traceback with the exception instance as an attribute, it means that traceback reference cycles can become much more common. No longer is it sufficient to refrain from examining sys.exc_info(). Whenever you store an exception object in a variable local to a frame that is part of its traceback, you get a cycle. This includes both the function where the exception is raised, and where it is caught.
Code like this is suspect:
    def catch():
        try:
            result = bar()
        except Exception as e:
            result = e
        return result
The variable result is part of the frame that is referenced by result.__traceback__, so a cycle has been created.
(Note that the variable e is not problematic. In Python 3, this variable is automatically cleared when the except clause is exited.)
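A small check of that automatic clearing, as a sketch only (the function name probe_e is hypothetical): reading e after the except clause fails, because the name was unbound when the clause ended.

```python
def probe_e():
    try:
        raise ValueError("boom")
    except ValueError as e:
        pass  # e is bound only inside this clause

    try:
        return repr(e)  # the name was deleted when the clause above ended
    except NameError as err:
        # UnboundLocalError is a subclass of NameError
        return type(err).__name__
```

Calling probe_e() reports the name of the error raised by the lookup rather than the repr of the original exception.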
Similarly suspect is this helper:
    def reraise(tp, value, tb=None):
        if value is None:
            value = tp()
        if value.__traceback__ is not tb:
            raise value.with_traceback(tb)
        raise value
(The above code is taken from the six module)
Both of these cases can be handled with a well placed try-finally to clear the variables result, value and tb respectively:
    def catch():
        try:
            result = bar()
        except Exception as e:
            result = e
        try:
            return result
        finally:
            del result
    def reraise(tp, value, tb=None):
        if value is None:
            value = tp()
        try:
            if value.__traceback__ is not tb:
                raise value.with_traceback(tb)
            raise value
        finally:
            del value, tb
Note that the caller of reraise() also has to clear his locals that he used as an argument, because the same exception is being re-raised and the caller's frame will get added to the exception:
    try:
        reraise(*exctuple)
    finally:
        del exctuple
The lesson learned from this is the following:
Don’t store exceptions in local variables for longer than necessary. Always clear such variables when leaving the function using try-finally.
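To see what the cleanup in catch() buys, here is a hedged sketch that again assumes CPython's reference counting. Payload, catch_leaky, catch_clean and frame_survives are illustrative names; both catchers return a weak reference alongside the exception so that the fate of the catching frame can be observed.

```python
import gc
import weakref


class Payload:
    """Stand-in for data living in the catching frame."""


def bar():
    raise ValueError("boom")


def catch_leaky():
    payload = Payload()
    probe = weakref.ref(payload)
    try:
        result = bar()
    except Exception as e:
        result = e
    return result, probe  # 'result' stays in the frame: a cycle


def catch_clean():
    payload = Payload()
    probe = weakref.ref(payload)
    try:
        result = bar()
    except Exception as e:
        result = e
    try:
        return result, probe
    finally:
        del result  # the frame no longer refers to the exception


def frame_survives(catcher):
    """True if the catcher's frame outlives the returned exception."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        exc, probe = catcher()
        del exc  # drop the only outside reference
        return probe() is not None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # tidy up any cycle left behind
```

With the try-finally in place, dropping the exception is enough to release the frame; without it, the frame lingers until the cyclic collector gets around to it.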
sys.exc_clear()
This function has been removed in Python 3. This is because it is no longer needed:
    def mainloop():
        while True:
            try:
                do_work()
            except Exception as e:
                report_error(e)
            assert sys.exc_info() == (None, None, None)
As long as sys.exc_info() was empty when the function was called, i.e. it was not called as part of exception handling, the internal exception state is clear again outside the except clause.
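Both halves of that statement can be checked with a short sketch (the function names handle_once and call_from_handler are made up for this example). The first function records the exception state inside and after its own except clause; the second calls it while another exception is being handled.

```python
import sys


def handle_once():
    try:
        raise ValueError("inner")
    except ValueError:
        inside = sys.exc_info()[0]
    after = sys.exc_info()[0]  # state once the clause has been left
    return inside, after


def call_from_handler():
    try:
        raise KeyError("outer")
    except KeyError:
        # handle_once() now runs as part of exception handling
        return handle_once()
```

Called normally, handle_once() sees an empty state after its except clause. Called from within a handler, it sees the caller's exception restored instead.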
However, if you want to hang on to an exception for some time, and are worried about reference cycles or memory usage, you have two options:
- clear the exception’s __traceback__ attribute:
e.__traceback__ = None
- use the new traceback.clear_frames() function
traceback.clear_frames(e.__traceback__)
clear_frames() was added to remove local variables from tracebacks in order to reduce their memory footprint. As a side effect, it will clear the reference cycles.
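The effect of both options can be compared in a sketch like the one below. It assumes CPython's reference counting, and every name in it apart from traceback.clear_frames() and __traceback__ is hypothetical. An object is passed down into the failing call so that it ends up in the locals of the frames recorded in the traceback.

```python
import gc
import traceback
import weakref


class Payload:
    """Stand-in for a large local kept alive by a traceback."""


def failing(payload):
    raise ValueError("boom")


def capture(payload):
    try:
        failing(payload)
    except ValueError as e:
        return e  # the name e is unbound on the way out


def drop_traceback(exc):
    exc.__traceback__ = None


def clear_frame_locals(exc):
    traceback.clear_frames(exc.__traceback__)


def released_by(cleanup):
    """True if cleanup(exc) lets the payload die while exc is kept."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        payload = Payload()
        probe = weakref.ref(payload)
        exc = capture(payload)
        del payload  # now only the traceback's frames hold it
        held_before = probe() is not None
        cleanup(exc)
        return held_before and probe() is None
    finally:
        if was_enabled:
            gc.enable()
```

Either cleanup releases the payload while the exception object itself stays usable; the difference is that clear_frames() keeps the traceback (file names and line numbers) and only empties the frames.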
Conclusion
Exception reference cycles are a nuisance when developing robust Python applications. Python 3 has added some extra pitfalls. Even though the local variable of the except clause is automatically cleared, the user must himself clear any other variables that might contain exception objects.
I don’t understand why you would do this.
Typically one would return a value OR raise an exception.
Apologies if that was formatted poorly. There’s no preview button here to see how to do code tags.
The point of this artificial code was to demonstrate that returning an exception can also be problematic.
Sometimes you are marshalling calls across some thread boundary or other and you want to break the exception chain.
Then you might have something like the following:
and elsewhere:
In both of these places, it would be important to clear ‘result’ before leaving the frame.
Excellent post, thanks a lot. I had an issue with throwing a lot of exceptions in a huge loop, causing a memory crash.
Here’s how I got to know this behaviour:
Thank you for this article!
Great article! Covers exactly what I’ve looked for. Thank you!
Watch out also that “e.__traceback__ = None” has no effect if it is followed by a bare raise, because the re-raise then adds a __traceback__ back onto the original exception object. Proof:
```
cache = {}
try:
    try:
        1/0
    except Exception as ex:
        ex.__traceback__ = None
        print(ex, ex.__traceback__)
        cache[1] = ex
        raise
except Exception:
    pass
print(cache[1], cache[1].__traceback__)
```
To work around this, you cannot store the original exception object in the cache, you have to 1. create a copy, 2. set “__traceback__ = None” on the copy, and 3. store the copy in the cache, not the original exception.
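One possible reading of those three steps, as a sketch only: it uses copy.copy() to make the copy, which is an assumption that works for exceptions that can be rebuilt from their args and may not suit classes with unusual constructors. The function name fail_and_cache is made up.

```python
import copy

cache = {}


def fail_and_cache():
    try:
        try:
            1 / 0
        except Exception as ex:
            stored = copy.copy(ex)  # assumption: the exception is copyable
            stored.__traceback__ = None
            cache[1] = stored
            raise  # re-raises ex; the cached copy is not touched
    except Exception:
        pass
```

After fail_and_cache() has run, the cached object is a separate ZeroDivisionError whose __traceback__ is still None, because the bare raise only affected the original.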