Reference cycles with closures

While polishing our forthcoming console game, our team in Shanghai is relentlessly trying to minimize Python memory use.
Today, an engineer complained to me that “cell” objects were being leaked (*).

This rang a bell with me. In 2009, I had posted about this to python-dev.
The response at the time wasn’t very sympathetic: I should be doing things differently, or simply rely on the cyclic garbage collector and not try to be clever. Yet, as I pointed out, parts of the standard library are aware of the problem and do help you with these things, such as the xml.dom.minidom.unlink() method.
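
For reference, minidom’s approach is to let you break the cycles explicitly once you are done with a document, roughly like this:
[python]
import xml.dom.minidom

doc = xml.dom.minidom.parseString("<root><child/></root>")
try:
    pass  # ... work with the DOM tree ...
finally:
    # break the parent/child cycles by hand so that reference
    # counting can release the tree immediately
    doc.unlink()
[/python]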

The data being leaked now appeared to pertain to the json module:

[2861.88] Python: 0: <bound method JSONEncoder.default of <json.encoder.JSONEncoder object at 0x12e14010>>
[2861.88] Python: 1: <bound method JSONEncoder.default of <json.encoder.JSONEncoder object at 0x12e14010>>
[2861.88] Python: 2: <bound method JSONEncoder.default of <json.encoder.JSONEncoder object at 0x12e14010>>

This prompted me to have a look in the json module, and behold, json.encoder contains this pattern:
[python]
def _make_iterencode(…):

    def _iterencode(o, _current_indent_level):
        if isinstance(o, basestring):
            yield _encoder(o)
        elif o is None:
            yield 'null'
        elif o is True:
            yield 'true'
        elif o is False:
            yield 'false'
        elif isinstance(o, (int, long)):
            yield str(o)
        elif isinstance(o, float):
            yield _floatstr(o)
        elif isinstance(o, (list, tuple)):
            for chunk in _iterencode_list(o, _current_indent_level):
                yield chunk
        elif isinstance(o, dict):
            for chunk in _iterencode_dict(o, _current_indent_level):
                yield chunk
        else:
            if markers is not None:
                markerid = id(o)
                if markerid in markers:
                    raise ValueError("Circular reference detected")
                markers[markerid] = o
            o = _default(o)
            for chunk in _iterencode(o, _current_indent_level):
                yield chunk
            if markers is not None:
                del markers[markerid]

    return _iterencode
[/python]

The problem is this: the returned function has a func_closure attribute containing the “cell” objects, one of which points back to the function itself. There is no way to clear func_closure after use; it is a read-only attribute. And so, iterencoding things using the json module creates reference cycles that persist until the next cyclic garbage collection, possibly causing Python to hang on to all the data that was supposed to be encoded and then thrown away.
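
Here is a minimal illustration (not the json code itself, just the same shape); in Python 2 the cells are exposed through the read-only func_closure attribute:
[python]
def make_closure():
    def inner():
        return inner            # inner refers to itself through a cell
    return inner

f = make_closure()
print f.func_closure                        # a tuple holding one cell object
print f.func_closure[0].cell_contents is f  # True: the cell points back at f
# f -> func_closure tuple -> cell -> f is a cycle, and since func_closure
# is read-only there is no way to break it from the outside.
[/python]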

Looking for a workaround, I wrote this code, emulating part of what is going on:
[python]
def itertest(o):
    def listiter(l):
        for i in l:
            if isinstance(i, list):
                chunks = listiter(i)
                for i in chunks:
                    yield i
            else:
                yield i
    return listiter(o)
[/python]

Testing it confirmed the problem:

>>> import gc, celltest
>>> gc.collect()
>>> gc.set_debug(gc.DEBUG_LEAK)
>>> l = [1, [2, 3]]
>>> i = celltest.itertest(l)
>>> list(i)
[1, 2, 3]
>>> gc.collect()
gc: collectable <cell 01E96B50>
gc: collectable <function 01E97330>
gc: collectable <tuple 01E96910>
gc: collectable <cell 01E96B30>
gc: collectable <tuple 01E96950>
gc: collectable <function 01E973F0>
3

To fix this, it is necessary to clear the “cell” objects once there is no more need for them. It is not possible to do this from the outside, so how about from the inside? Changing the code to:
[python]
def itertest2(o):
    def listiter(l):
        for i in l:
            if isinstance(i, list):
                chunks = listiter(i)
                for i in chunks:
                    yield i
            else:
                yield i

    chunks = listiter(o)
    for i in chunks:
        yield i
    chunks = listiter = None
[/python]
This does the trick: the function becomes a generator, yields the items, and then cleans up after itself:

>>> o = celltest.itertest2(l)
>>> list(o)
[1, 2, 3]
>>> gc.collect()
0

It is an unfortunate situation. The workaround requires work to be done inside the function. It would be cool if it were possible to clear a function’s closure from the outside by calling, e.g., func.close(). As it is, people have to be aware of these hidden cycles and code carefully around them.

(*) Leaking in this case means not being released immediately by reference counting, but lingering until the cyclic collector gets around to it. We don’t want to rely on the gc module’s quirkiness in a video game.

Update:

In my toy code, I got the semantics slightly wrong.  Actually, it is more like this:
[python]
def make_iter():
    def listiter(l):
        for i in l:
            if isinstance(i, list):
                chunks = listiter(i)
                for i in chunks:
                    yield i
            else:
                yield i
    return listiter

def get_iterator(data):
    it = make_iter()
    return it(data)
[/python]

This complicates things. During iteration, no code is running in the scope of make_iter, so there is nothing there we can hook into to clear those locals after iteration. Everything runs in the nested functions, and since I am using Python 2.7 (which doesn’t have the “nonlocal” keyword) there seems to be no way to clear the outer locals from within the inner functions once iteration is done.
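
For what it’s worth, with Python 3’s “nonlocal” the clearing could be done from inside listiter itself. A sketch (one-shot by design: once the top-level iteration finishes, the cell is cleared and the returned function can no longer recurse):
[python]
# Python 3 only -- not an option for us on 2.7
def make_iter():
    def listiter(l, _toplevel=True):
        nonlocal listiter           # lets the inner function rebind the outer local
        try:
            for i in l:
                if isinstance(i, list):
                    for j in listiter(i, _toplevel=False):
                        yield j
                else:
                    yield i
        finally:
            if _toplevel:
                listiter = None     # break the cell -> function cycle when done
    return listiter
[/python]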

I guess that means that I’ll have to modify this code to use class objects instead.
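
Something along these lines, perhaps, where the recursion goes through an attribute lookup on the instance instead of a closure cell, so plain reference counting can clean everything up (names are mine, not the json module’s):
[python]
class ListIterator(object):
    """Does the same as make_iter/listiter, but without closure cells."""
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        return self._walk(self.data)

    def _walk(self, l):
        for i in l:
            if isinstance(i, list):
                # recursion via self._walk: no cell, hence no hidden cycle
                for j in self._walk(i):
                    yield j
            else:
                yield i

def get_iterator(data):
    return iter(ListIterator(data))
[/python]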

Also, while on the topic, I think Raymond Hettinger’s class-like objects are subject to this problem if they have any sort of mutual or recursive relationship among their “members”.
 

Finding C reference leaks using the gc module

A while back I got a defect assigned to me complaining that clicking a button on our server backend web pages caused the server to freeze.  The link was on our “python” page, containing various tools and information on the python interpreter embedded in EVE.  The link itself was, interestingly enough, called “Leaky C++”.

Looking at the source code I saw code similar to this:

[python]
import gc
def get_leaking_objects_naive():
    all = gc.get_objects()
    result = []
    find = [all]
    for i in xrange(len(all)):
        if gc.get_referrers(all[i]) == find:
            result.append(all[i])
    return result
[/python]

The idea is to find all objects that do not appear to be referenced by any other object and produce a report on those objects.  The problem with this code, however, is that it is O(N^2): gc.get_referrers() itself has to walk every object tracked by the collector, and here it is called once per object.  When there are enough objects in the system, this code takes forever to run.

I disabled this code and thought nothing more of it.  But recently, it occurred to me that the idea might not be so bad, and that there might be a better way to do it.  It turns out there is.  Instead of finding the referrers in this manner, we use another gc function, gc.get_referents(), which returns the objects immediately referred to by an object.  This is cheap, costing only as much as the object has direct references, and by repeatedly applying it and eliminating every object that is such a target, we can weed out everything that has referrers.  We then end up with the list of objects that have no referrers, in one fell O(N) swoop:

[python]
def get_leaking_objects():
    #create a dict of ids to objects
    all = dict((id(i), i) for i in gc.get_objects())

    #find all the objects that aren't referred to by any other object
    ids = set(all.keys())
    for i in all.values():
        ids.difference_update(id(j) for j in gc.get_referents(i))

    #this then is our set of objects without referrers
    return [all[i] for i in ids]
[/python]

This turns out to work surprisingly well.  Combined with an object hierarchy browser, this allows us to find suspicious objects, identify them and thus home in on the C code that may be causing trouble.
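
Even without a browser, a quick histogram by type is often enough to spot what is accumulating. Something like this little helper (my own addition, not part of the tool above):
[python]
from collections import defaultdict

def summarize_by_type(objects):
    #count the suspect objects by type name
    counts = defaultdict(int)
    for o in objects:
        counts[type(o).__name__] += 1
    for name, n in sorted(counts.items(), key=lambda item: -item[1]):
        print "%6d  %s" % (n, name)

summarize_by_type(get_leaking_objects())
[/python]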

There is a caveat to this, and it is that gc.get_objects() and gc.get_referents() are documented to only return objects that can be part of a reference cycle.  So your leaking strings and integers won’t show up using this tool.
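
You can see this with gc.is_tracked(), available since 2.7; atomic objects are simply never tracked by the collector:

>>> import gc
>>> gc.is_tracked(42), gc.is_tracked("hello")
(False, False)
>>> gc.is_tracked([1, 2, 3])
True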

Update:

I just made two improvements to the code.

  1. It is faster and uses less memory to skip creating a dict out of the objects.
  2. We must make sure not to leave cyclic references lying about.  The “all” list contains the current function frame, so unless we clear it, the frame and its contents (including “all”) aren’t released immediately.

[python]
def get_leaking_objects2():
    all = gc.get_objects()
    try:
        ids = set(id(i) for i in all)
        for i in all:
            ids.difference_update(id(j) for j in gc.get_referents(i))
        #this then is our set of objects without referrers
        return [i for i in all if id(i) in ids]
    finally:
        all = i = j = None #clear cyclic references to frame
[/python]