Evaluating Nagare

Introduction

A little known feature of EVE Online, disabled in the client but very much active in the server, is a web server.  This was added early in the development, before I started on the project back in 2003.  It is the main back-end access point to the game server, used for all kinds of management, status information and debugging.

Back then, Python was much less mature as a web serving framework.  Also we initially wanted just very rudimentary functionality.  So we wrote our own web server.  It, and the site it presents, were collectively called ESP, for Eve Server Pages and over the years it has grown in features and content.  The heaviest use that it sees is as the dashboard for Game Managers, where everything a GM needs to do is done through HTML pages.  It is also one of the main tools for content authoring, where game designers access a special authoring server.  ESP presents a content authoring interface that then gets stored in the backend database.

Recently we have increasingly started to look for alternatives to our homegrown HTTP solution, though.  The main reasons are:

  1. We want to use a standard Python web framework with all the bells and whistles and support that such frameworks offer
  2. We want modern Web 2.0 features without having to write them ourselves
  3. We want something that our web developers can be familiar with already
  4. We want to share expertise between the ESP pages and other web projects run by CCP.  Confluence of synergies and all that.

Stackless Python

EVE Online is based on Stackless Python.  Embedded into the the game engine is a locally patched version of Stackless Python 2.7.  We have been using Stackless since the very beginning, the existence of Stackless being the reason we chose Python as a scripting solution for our game.

Systems like the web server have always depended heavily on the use of Stackless.  Using it we have been able to provide a blocking programming interface to IO which uses asynchronous IO behind the scenes.  This allows for a simple and intuitive programming interface with good performance and excellent scalability.  For some years now we use the in-house developed StacklessIO solution which provides an alternative implementation of the socket module.  By also providing emulation of the threading module by using tasklets instead of threads, many off the shelf components simply work out of the box.  As an example, the standard library’s xmlrpc module, itself based on the socketserver module, just works without modification.

Of course, the use of Stackless python is not limited to IO.  A lot of the complicated game logic takes advantage of its programming model, having numerous systems that run as their own tasklets to do different things.  As with IO, this allows for a more intuitive programming experience.  We can write code in an imperative manner where more traditional solutions would have to rely on event driven approaches, state machines and other patterns that are better left for computers than humans to understand.  This makes CCP very much a Stackless Python shop and we are likely to stay that way for quite a bit.

Nagare

It was therefore with a great deal of interest that we noticed the announcement of Nagare on the Stackless mailing list a few years ago.

Nagare promises a different approach to web development.  Instead of web applications that are in effect giant event handlers (responding to the HTTP connectionless requests) it allows the user to write the web applications imperatively much as one would write desktop applications.  Their web site has a very interesting demo portal with a number of applications demonstrating their paradigm, complete with running apps and source code.

This resonates well with me.  I am of the opinion that the programmer is the slowest part of software development.  Anything that a development environment can do to make a programmer able to express himself in familiar, straight-forward manner is a net win.  This is why we we use tasklet blocking tasklet IO instead of writing a game based on something as befuddling as Twisted.  And for this reason I thought it worthwhile to see if a radical, forward thinking approach to web development might be right for CCP.

Tasklet pickling

Nagare achieves its magic by using a little known Stackless feature called tasklet pickling.  A tasklet that isn’t running can have its execution state pickled and stored as binary data.  The pickle contains the execution frames, local variables and other such things.

Stackless Python contains some extensions to allow pickling of various objects whose pickling isn’t supported in regular Python.  These include frames, modules and generators among other things.

When a Nagare web application reaches the point where interaction with the user is required, its state is pickled and stored with the Nagare server, and a HTTP response sent back to the client.  When a subsequent request arrives, the tasklet is unpickled and its execution continues.  From the programmer’s point of view, a function call simply took a long time.  Behind the scenes, Stackless and Nagare were working its magic, making the inherently stateless HTTP protocol seem like smooth program flow to the application.

IO Model

In other ways, Nagare is a very traditional Python web framework.  It is based around WSGI and so uses whatever threading and IO model provided to it by a WSGI server.

Application Model

Unlike some smaller frameworks, Nagare is designed and distributed as a complete web server.  One typically installs it as the single application of a virualenv root, and then configures and runs it using a script called nagare_admin.  This is very convenient for someone simply running a web server, but it becomes less obvious how to use it as a component in a larger application.

Our tests

What we were interested in doing was to see if Nagare would work within the application that is EVE.  To do this we would have to:

  1. Extract Nagare and its dependencies as a set of packages that can be used with the importing framework that EVE uses.  As I have blogged about before, Python by default assumes a central package directory and doesn’t lend itself well to isolation by embedded applications.  We have done that with EVE, however, and would want nagare  to be an “install free” part of our application.
  2. Set up the necessary WSGI infrastructure within EVE for Nagare to run on
  3. Configure and run Nagare programmatically rather than by using the top level application scripts provided by the Nagare distribution.

I specifically didn’t intend to evaluate Nagare from the web developer’s point of view.  This is because I am not one of those and find the whole domain rather alien.  This would be a technical evaluation from an embedding point of view.

Extracting

My first attempt at setting up Nagare was to fetch the source from its repository.

 svn co svn://www.nagare.org/trunk/nagare

I then intended to fetch any dependencies in a similar manual manner.  However, I soon found that to be a long and painstaking process.  In the end I gave up and used the recommended approach:  Set up a virtualenv and use:

<NAGARE_HOME>Scriptseasy_install.exe nagare

This turned out to install a lot of stuff.  This is the basic install and a total of 13 packages were installed in addition to Nagare itself, a total of almost 12Mb. The full install of Nagare increases this to a total of 22 packages and 22Mb.

The original plan was to take this and put it in a place where EVE imports its own files from.  But because of the amount of files in question, we ended up keeping them in place, and hacking EVE to import from <NAGARE_HOME>Libsite-packages.

Setting up WSGI

Nagare requires Paste and can make use of the Paste WSGI server.   Fortunately, EVE already has a built-in WSGI service, based on Paste.  It uses the standard socket module, monkeypatched to use StacklessIO tasklet-blocking sockets.  So this part is easy.

Fitting it together

This is where it finally got tricky.  Normally, Nagare is a stand-alone application, managed with config files.  A master script, nagare_admin, then reads those config files and assembles the various Nagare components into a working application.  Unfortunately, documentation about how to do this programmatically was lacking.

However, the good people of Nagare were able to help me out with the steps I needed to take, thus freeing me from having to reverse-engineer a lot of configuration code.  What I needed to do was to create a Publisher, a Session manager and the Nagare application I need to run.  For testing purposes I just wanted to run the admin app that comes with Nagare.

After configuring EVE’s importer to find Nagare and its dependencies in its default install location, the code I ended up with was this:

#Instantiate a Publisher (WSGI application)
from nagare.publishers.common import Publisher
p = Publisher()

#Register the publisher as a WSGI app with our WSGI server
sm.StartService('WSGIService').StartServer(p.urls, 8080) #our WSGI server

#instantiate a simple session manager
from nagare.sessions.memory_sessions import SessionsWithMemoryStates
sm = SessionsWithMemoryStates()

#import the app and configure it
from nagare.admin import serve, util, admin_app
app = admin_app.app
app.set_sessions_manager(sm)

#register the app with the publisher
p.register_application('admin', 'admin', app, app)

#register static resources
def lookup(r, path=r"D:nagareLibsite-packagesnagare-0.3.0-py2.5.eggstatic"):
    return serve.get_file_from_root(path, r)
p.register_static('nagare', lookup)

This gave the expected result. Browsing to port 8080 gave this image:

So, success!  EVE was serving a Nagare app from its backend.

Conclusion

These tests showed that Nagare does indeed work as a backend webserver for EVE.  In particular, the architecture of StaclessIO sockets allows most socket-based applications to just work out of the box.

Also, because Nagare is a Python package, it is inherently programmable.  So, it is possible to configure it to be a part of a larger application, rather than the stand-alone application that it is primarily designed to be.  Using Nagare as a library and not an application, however, wasn’t well documented and I had to have some help from its friendly developers and read the source code to get it to work.

On the other hand, Nagare is a large application.  Not only is Nagare itself a substantial package, it also has a lot of external dependencies.  For an embedded application of Python, such as a computer game, this is a serious drawback.   We like to be very selective about what modules we make available within EVE.  The reasons range from the purely practical (space restraints, versioning hell, build management complexity) to externally driven issues like security and licensing.

It is for this reason that we ultimately decided that Nagare wasn’t right for us as part of the EVE backend web interface.  The backend web interface started out as a minimal HTTP server and we want to keep it as slim as possible.  We are currently in the process of picking and choosing some standard WSGI components and writing special case code for our own use.  This does, however, mean that we miss out on the cool web programming paradigm that is Nagare within EVE.

Advertisement

Evaluating Nagare

Introduction

A little known feature of EVE Online, disabled in the client but very much active in the server, is a web server.  This was added early in the development, before I started on the project back in 2003.  It is the main back-end access point to the game server, used for all kinds of management, status information and debugging.

Back then, Python was much less mature as a web serving framework.  Also we initially wanted just very rudimentary functionality.  So we wrote our own web server.  It, and the site it presents, were collectively called ESP, for Eve Server Pages and over the years it has grown in features and content.  The heaviest use that it sees is as the dashboard for Game Managers, where everything a GM needs to do is done through HTML pages.  It is also one of the main tools for content authoring, where game designers access a special authoring server.  ESP presents a content authoring interface that then gets stored in the backend database.

Recently we have increasingly started to look for alternatives to our homegrown HTTP solution, though.  The main reasons are:

  1. We want to use a standard Python web framework with all the bells and whistles and support that such frameworks offer
  2. We want modern Web 2.0 features without having to write them ourselves
  3. We want something that our web developers can be familiar with already
  4. We want to share expertise between the ESP pages and other web projects run by CCP.  Confluence of synergies and all that.

Stackless Python

EVE Online is based on Stackless Python.  Embedded into the the game engine is a locally patched version of Stackless Python 2.7.  We have been using Stackless since the very beginning, the existence of Stackless being the reason we chose Python as a scripting solution for our game.

Systems like the web server have always depended heavily on the use of Stackless.  Using it we have been able to provide a blocking programming interface to IO which uses asynchronous IO behind the scenes.  This allows for a simple and intuitive programming interface with good performance and excellent scalability.  For some years now we use the in-house developed StacklessIO solution which provides an alternative implementation of the socket module.  By also providing emulation of the threading module by using tasklets instead of threads, many off the shelf components simply work out of the box.  As an example, the standard library’s xmlrpc module, itself based on the socketserver module, just works without modification.

Of course, the use of Stackless python is not limited to IO.  A lot of the complicated game logic takes advantage of its programming model, having numerous systems that run as their own tasklets to do different things.  As with IO, this allows for a more intuitive programming experience.  We can write code in an imperative manner where more traditional solutions would have to rely on event driven approaches, state machines and other patterns that are better left for computers than humans to understand.  This makes CCP very much a Stackless Python shop and we are likely to stay that way for quite a bit.

Nagare

It was therefore with a great deal of interest that we noticed the announcement of Nagare on the Stackless mailing list a few years ago.

Nagare promises a different approach to web development.  Instead of web applications that are in effect giant event handlers (responding to the HTTP connectionless requests) it allows the user to write the web applications imperatively much as one would write desktop applications.  Their web site has a very interesting demo portal with a number of applications demonstrating their paradigm, complete with running apps and source code.

This resonates well with me.  I am of the opinion that the programmer is the slowest part of software development.  Anything that a development environment can do to make a programmer able to express himself in familiar, straight-forward manner is a net win.  This is why we we use tasklet blocking tasklet IO instead of writing a game based on something as befuddling as Twisted.  And for this reason I thought it worthwhile to see if a radical, forward thinking approach to web development might be right for CCP.

Tasklet pickling

Nagare achieves its magic by using a little known Stackless feature called tasklet pickling.  A tasklet that isn’t running can have its execution state pickled and stored as binary data.  The pickle contains the execution frames, local variables and other such things.

Stackless Python contains some extensions to allow pickling of various objects whose pickling isn’t supported in regular Python.  These include frames, modules and generators among other things.

When a Nagare web application reaches the point where interaction with the user is required, its state is pickled and stored with the Nagare server, and a HTTP response sent back to the client.  When a subsequent request arrives, the tasklet is unpickled and its execution continues.  From the programmer’s point of view, a function call simply took a long time.  Behind the scenes, Stackless and Nagare were working its magic, making the inherently stateless HTTP protocol seem like smooth program flow to the application.

IO Model

In other ways, Nagare is a very traditional Python web framework.  It is based around WSGI and so uses whatever threading and IO model provided to it by a WSGI server.

Application Model

Unlike some smaller frameworks, Nagare is designed and distributed as a complete web server.  One typically installs it as the single application of a virualenv root, and then configures and runs it using a script called nagare_admin.  This is very convenient for someone simply running a web server, but it becomes less obvious how to use it as a component in a larger application.

Our tests

What we were interested in doing was to see if Nagare would work within the application that is EVE.  To do this we would have to:

  1. Extract Nagare and its dependencies as a set of packages that can be used with the importing framework that EVE uses.  As I have blogged about before, Python by default assumes a central package directory and doesn’t lend itself well to isolation by embedded applications.  We have done that with EVE, however, and would want nagare  to be an “install free” part of our application.
  2. Set up the necessary WSGI infrastructure within EVE for Nagare to run on
  3. Configure and run Nagare programmatically rather than by using the top level application scripts provided by the Nagare distribution.

I specifically didn’t intend to evaluate Nagare from the web developer’s point of view.  This is because I am not one of those and find the whole domain rather alien.  This would be a technical evaluation from an embedding point of view.

Extracting

My first attempt at setting up Nagare was to fetch the source from its repository.

 svn co svn://www.nagare.org/trunk/nagare

I then intended to fetch any dependencies in a similar manual manner.  However, I soon found that to be a long and painstaking process.  In the end I gave up and used the recommended approach:  Set up a virtualenv and use:

<NAGARE_HOME>Scriptseasy_install.exe nagare

This turned out to install a lot of stuff.  This is the basic install and a total of 13 packages were installed in addition to Nagare itself, a total of almost 12Mb. The full install of Nagare increases this to a total of 22 packages and 22Mb.

The original plan was to take this and put it in a place where EVE imports its own files from.  But because of the amount of files in question, we ended up keeping them in place, and hacking EVE to import from <NAGARE_HOME>Libsite-packages.

Setting up WSGI

Nagare requires Paste and can make use of the Paste WSGI server.   Fortunately, EVE already has a built-in WSGI service, based on Paste.  It uses the standard socket module, monkeypatched to use StacklessIO tasklet-blocking sockets.  So this part is easy.

Fitting it together

This is where it finally got tricky.  Normally, Nagare is a stand-alone application, managed with config files.  A master script, nagare_admin, then reads those config files and assembles the various Nagare components into a working application.  Unfortunately, documentation about how to do this programmatically was lacking.

However, the good people of Nagare were able to help me out with the steps I needed to take, thus freeing me from having to reverse-engineer a lot of configuration code.  What I needed to do was to create a Publisher, a Session manager and the Nagare application I need to run.  For testing purposes I just wanted to run the admin app that comes with Nagare.

After configuring EVE’s importer to find Nagare and its dependencies in its default install location, the code I ended up with was this:

#Instantiate a Publisher (WSGI application)
from nagare.publishers.common import Publisher
p = Publisher()

#Register the publisher as a WSGI app with our WSGI server
sm.StartService('WSGIService').StartServer(p.urls, 8080) #our WSGI server

#instantiate a simple session manager
from nagare.sessions.memory_sessions import SessionsWithMemoryStates
sm = SessionsWithMemoryStates()

#import the app and configure it
from nagare.admin import serve, util, admin_app
app = admin_app.app
app.set_sessions_manager(sm)

#register the app with the publisher
p.register_application('admin', 'admin', app, app)

#register static resources
def lookup(r, path=r"D:nagareLibsite-packagesnagare-0.3.0-py2.5.eggstatic"):
    return serve.get_file_from_root(path, r)
p.register_static('nagare', lookup)

This gave the expected result. Browsing to port 8080 gave this image:

So, success!  EVE was serving a Nagare app from its backend.

Conclusion

These tests showed that Nagare does indeed work as a backend webserver for EVE.  In particular, the architecture of StaclessIO sockets allows most socket-based applications to just work out of the box.

Also, because Nagare is a Python package, it is inherently programmable.  So, it is possible to configure it to be a part of a larger application, rather than the stand-alone application that it is primarily designed to be.  Using Nagare as a library and not an application, however, wasn’t well documented and I had to have some help from its friendly developers and read the source code to get it to work.

On the other hand, Nagare is a large application.  Not only is Nagare itself a substantial package, it also has a lot of external dependencies.  For an embedded application of Python, such as a computer game, this is a serious drawback.   We like to be very selective about what modules we make available within EVE.  The reasons range from the purely practical (space restraints, versioning hell, build management complexity) to externally driven issues like security and licensing.

It is for this reason that we ultimately decided that Nagare wasn’t right for us as part of the EVE backend web interface.  The backend web interface started out as a minimal HTTP server and we want to keep it as slim as possible.  We are currently in the process of picking and choosing some standard WSGI components and writing special case code for our own use.  This does, however, mean that we miss out on the cool web programming paradigm that is Nagare within EVE.

_ssl modifications in 2.7

I recently pushed into action a plan that had been brewing for long while: To make the SSL module in the standard library work for our StacklessIO socket library.

The problem with _ssl is that it just uses internally an native socket BIO.  It doesn’t allow the application to control details of the communications.  This is unsuitable for anyone that is using a non-standard socket implementation.

I admit that I didn’t look around, otherwise I probably would have stumbled across pyOpenSSL I simply assumed that the standard library ssl was the only one and that was clearly unsuitable since it does its own socket IO.

So, anyway, what I did was to add two different features to _ssl.sslwrap():

  1. Instead of only allowing wrapping a socket, one can pass in a Python object. This is assumed to be a PyBIO object, an object that implements the read() and write() methods. the SSLContext object thus created will internally create a BIO pair object and then any “read” or “write” calls on it may invoke corresponding Python callbacks to the PyBIO object to satisfy the SSL context’s need to send or receive data.
    This feature can be used to send the data over whatever Python IO channel one chooses to implement in Python.
  2. Additionally, the user may wrap None. In this case, a “naked” SSLContext will be created, with a BIO pair. This is then suitable for use by any other C code that knows about Open SSL, the layout of PySSLObject and how to pump data back and forth out of a BIO pair. To facilitate this, I export the _ssl api in the same way that _socket does (to _ssl), using a new function PySSLModule_ImportModuleAndAPI(void).
    The purpose of this feature is to be able to use standard Python code to manage certificates and set up the SSL contex, then hand off this context to a custom transport layer written in C/C++.

This is now complete. The changes were not large. I have written extenstions to the test_ssl.py unittests that wrap a regular socket in a PyBIO and the entire thing works.

We are now using this in EVE, to add SSL support to the backend webserver. We need to pipe data trough the _ssl module and back into our code where stackless stack swapping magic happens to emulate blocking IO.

The question that remains, is this: I´d like to contribute this stuff back, but due to Python 2.x being feature frozen, there is no good place for it. Maybe I could push this into 3.x but I haven’t looked. Maybe it does ssl differently.

So it remains, like many other goodies, part of the custom branch of stackless 2.7 that CCP uses. For the time being.

Installing a Python library: A traveler’s tale.

Ok, I confess: As a core Python contributor, and an embedder of Stackless Python into a large application, I don’t often work much with the installed, vanilla version of python that the world at large knows and loves.

But a recent comment to one of my posts here prompted me to have a look at an off-the-shelf library to visualize graphs.  I’m currently working on an idea that involves binary trees and I thought that using Python for prototyping and visualizing my problem in python instead of nitty-gritty C would make perfect sense.  So this is where my journey started.  This is the tale of a software engineer installing a Python application from the internet.

So, a fresh install of Python.  I got the latest x64 binaries of python 2.7 from python.org and installed it.  This I’ve done lots of times and it just worked.  No problems there.  But what about installing the software?

The project’s home page said that I should “pip install objgraph” to install the software.  Knowing that this is supposed to be a command line, I went ahead and tried it, fully expecting it not to work.  As it didn´t.  So, something else is required.

What is this “pip” then?  Googling for it turned up this page: http://pip-group.org/v2/index.html.  Not what I expected.  “pip python” then:  http://pypi.python.org/pypi/pip.  Yes, an “easy install replacement.”  Seems like it fits the bill.  Then how do I install it?  The page doesn’t say.  It says a lot about pip, but not how to install it. but there is a “Downloads” link, which I follow to find a .zip file which I download.

Now, I´m a software engineer, and a Python veteran, so I know enough to unzip this file into a temporary directory.  I also find a “setup.py” file in there and so, of course, I open a command shell and type the following:

c:python27python.exe tmppipsetup.py

(notice how I have to specify the full path to Python.  The installer doesn’t add Python to the path for me.  Tedious.)

Imagine my lack of surprise when my Python complains about not finding something called “setuptools,” then.

Ok then, back to the drawing board.  Obviously, “setuptools” are required to install and they appear to be yet another external package.  Google brings me to this page: http://pypi.python.org/pypi/setuptools

This time, installation promises to be simple.  There is a windows .exe installer.  I download and run the “win32-py27.exe.”  Again, I am not particularly surprised when it fails to find a python 2.7 installation.  After all, I elected to install the 64 bit Python, since my computer speaks 64 bits natively.   Reading through the page and grepping for “x64” led me to conclude that I should download the “source” version of setuptools.  So, again, I draw on my engineering skills, download and unzip and then “c:python27python.exe tmpsetuptoolssetup.py install” as suggested for win64.  And it works.

Now then, back to installing pip.  This time around, “c:python27python.exe c:tmppipsetup.py install” does the trick.  The script seems to indicate a successful execution.

So then, can I start to “pip install objgraph?”  Nope.  there is no “pip” in the path.  Okay then, well, “pip” is a Python module, isn’t it?  Shouldn’t I then just type “c:python27python.exe -m pip install objgraph?”  No luck either.  Python informs me that “pip” is a package and cannot be run like that.

Okay, I know what is going on.  There is a .py script somewhere, pip.py that I need to run.  The instructions are written for a Unix user that has the directory in its PATH and the file has the magic #!python in the first line.  Yes, I used to be a unix guy.  So, where is this file then?  Looking closely, I finally find pip.exe under c:python27scripts.  So, finally:

c:python27scriptspip.exe install objgraph

This works and installs the library.  Great!

But what next?  The objgraph page recommended i use “graphwiz” for the spiffy graphics, specifically the “xdot” package.   So I install (using pip.exe) xdot, only to find that it can´t run because it is missing the “gobject” module.

And, at this point we branch off on a different Odyssey involving GTK+, PyGTK, PyCairo, PyGobject and Graphwiz that is too long and painful to recount.  Suffice to say that I gave up.

There are a few main points that I’m trying to make with all this.  Here they are:

  1. Installing python on a windows machine doesn’t add it to the PATH.  Also, there are no automatic shell associations.  You have to know how to run your python scripts from the command line.
  2. There is no simple starting point to start installing packages.  At the very least one has to locate and install setuptools, which is not obvious how to do, at least if one has made the mistake of installing a 64 bit version of Python.  And then one has to locate and install the friendlier “pip” installer using a lengthy command line.
  3. Most of the instructions encountered assume a unix environment.  Perhaps the experience is smoother on unix.  Perhaps not.

This wasn’t the first time I tried to install a Python package.  And I admit that I am being deliberately obtuse to illustrate a point.  But every time I go through these moves I am amazed at how cumbersome it is.  I am a computer veteran of the Jupiter ACE era and know my onions but there are people out there that don’t and who will be turned away much earlier than I was and that is a shame.

But there is also another issue here that I find unsettling with the whole process.  Even assuming that all goes well, I have the proper install tools and know how to use them:  Whenever I want to go and check out some python project or other, I find that I have to install it and what is more, install all its prerequisites.  And so on ad infinitum.  Very soon, you find that you have installed all kinds of modules and packages, permanently modifying your python environment.  There is no obvious cleanup mechanism and no way of tracking dependencies.  I am loath to do this because I always have this uneasy feeling that my Python install is somehow tainted after doing this.

So, this blog is digressing into another about packaging so I will stop here and rant about that at a later date.

Installing a Python library: A traveler’s tale.

Ok, I confess: As a core Python contributor, and an embedder of Stackless Python into a large application, I don’t often work much with the installed, vanilla version of python that the world at large knows and loves.

But a recent comment to one of my posts here prompted me to have a look at an off-the-shelf library to visualize graphs.  I’m currently working on an idea that involves binary trees and I thought that using Python for prototyping and visualizing my problem in python instead of nitty-gritty C would make perfect sense.  So this is where my journey started.  This is the tale of a software engineer installing a Python application from the internet.

So, a fresh install of Python.  I got the latest x64 binaries of python 2.7 from python.org and installed it.  This I’ve done lots of times and it just worked.  No problems there.  But what about installing the software?

The project’s home page said that I should “pip install objgraph” to install the software.  Knowing that this is supposed to be a command line, I went ahead and tried it, fully expecting it not to work.  As it didn´t.  So, something else is required.

What is this “pip” then?  Googling for it turned up this page: http://pip-group.org/v2/index.html.  Not what I expected.  “pip python” then:  http://pypi.python.org/pypi/pip.  Yes, an “easy install replacement.”  Seems like it fits the bill.  Then how do I install it?  The page doesn’t say.  It says a lot about pip, but not how to install it. but there is a “Downloads” link, which I follow to find a .zip file which I download.

Now, I´m a software engineer, and a Python veteran, so I know enough to unzip this file into a temporary directory.  I also find a “setup.py” file in there and so, of course, I open a command shell and type the following:

c:python27python.exe tmppipsetup.py

(notice how I have to specify the full path to Python.  The installer doesn’t add Python to the path for me.  Tedious.)

Imagine my lack of surprise when my Python complains about not finding something called “setuptools,” then.

Ok then, back to the drawing board.  Obviously, “setuptools” are required to install and they appear to be yet another external package.  Google brings me to this page: http://pypi.python.org/pypi/setuptools

This time, installation promises to be simple.  There is a windows .exe installer.  I download and run the “win32-py27.exe.”  Again, I am not particularly surprised when it fails to find a python 2.7 installation.  After all, I elected to install the 64 bit Python, since my computer speaks 64 bits natively.   Reading through the page and grepping for “x64” led me to conclude that I should download the “source” version of setuptools.  So, again, I draw on my engineering skills, download and unzip and then “c:python27python.exe tmpsetuptoolssetup.py install” as suggested for win64.  And it works.

Now then, back to installing pip.  This time around, “c:python27python.exe c:tmppipsetup.py install” does the trick.  The script seems to indicate a successful execution.

So then, can I start to “pip install objgraph?”  Nope.  there is no “pip” in the path.  Okay then, well, “pip” is a Python module, isn’t it?  Shouldn’t I then just type “c:python27python.exe -m pip install objgraph?”  No luck either.  Python informs me that “pip” is a package and cannot be run like that.

Okay, I know what is going on.  There is a .py script somewhere, pip.py that I need to run.  The instructions are written for a Unix user that has the directory in its PATH and the file has the magic #!python in the first line.  Yes, I used to be a unix guy.  So, where is this file then?  Looking closely, I finally find pip.exe under c:python27scripts.  So, finally:

c:python27scriptspip.exe install objgraph

This works and installs the library.  Great!

But what next?  The objgraph page recommended i use “graphwiz” for the spiffy graphics, specifically the “xdot” package.   So I install (using pip.exe) xdot, only to find that it can´t run because it is missing the “gobject” module.

And, at this point we branch off on a different Odyssey involving GTK+, PyGTK, PyCairo, PyGobject and Graphwiz that is too long and painful to recount.  Suffice to say that I gave up.

There are a few main points that I’m trying to make with all this.  Here they are:

  1. Installing python on a windows machine doesn’t add it to the PATH.  Also, there are no automatic shell associations.  You have to know how to run your python scripts from the command line.
  2. There is no simple starting point to start installing packages.  At the very least one has to locate and install setuptools, which is not obvious how to do, at least if one has made the mistake of installing a 64 bit version of Python.  And then one has to locate and install the friendlier “pip” installer using a lengthy command line.
  3. Most of the instructions encountered assume a unix environment.  Perhaps the experience is smoother on unix.  Perhaps not.

This wasn’t the first time I tried to install a Python package.  And I admit that I am being deliberately obtuse to illustrate a point.  But every time I go through these moves I am amazed at how cumbersome it is.  I am a computer veteran of the Jupiter ACE era and know my onions but there are people out there that don’t and who will be turned away much earlier than I was and that is a shame.

But there is also another issue here that I find unsettling with the whole process.  Even assuming that all goes well, I have the proper install tools and know how to use them:  Whenever I want to go and check out some python project or other, I find that I have to install it and what is more, install all its prerequisites.  And so on ad infinitum.  Very soon, you find that you have installed all kinds of modules and packages, permanently modifying your python environment.  There is no obvious cleanup mechanism and no way of tracking dependencies.  I am loath to do this because I always have this uneasy feeling that my Python install is somehow tainted after doing this.

So, this blog is digressing into another about packaging so I will stop here and rant about that at a later date.

Finding C reference leaks using the gc module

A while back I got a defect assigned to me complaining that clicking a button on our server backend web pages caused the server to freeze.  The link was on our “python” page, containing various tools and information on the python interpreter embedded in EVE.  The link itself was, interestingly enough, called “Leaky C++”.

Looking at the source code I saw code similar to this:

import gc
def get_leaking_objects_naive():
    all = gc.get_objects()
    result = []
    find = [all]
    for i in xrange(len(all)):
        if gc.get_referrers(all[i]) == find:
            result.append(all[i])
    return result

The idea is to find all objects that do not appear to be referenced by any other object and produce a report on those objects.  The problem with this code is, however, that it is O(N^2).   When there are sufficient objects in the system, this code takes forever to run.

I disabled this code and thought nothing more of it.  But recently, it occurred to me that the idea might not be too bad and whether there might be a better way to do this.  It turns out there is.  Instead of finding the referrers in this manner, we use another gc method, gc.get_referents() that returns objects immediately referred to by an object.  This is an O(1) operation and by repeately using it and eliminating objects that are such targets we can weed out everything that has referrers.  We then  end up with the list of objects that have no referrers, in one fell O(N) swoop:

def get_leaking_objects():
    #create a dict of ids to objects
    all = dict((id(i), i) for i in gc.get_objects())

    #find all the objects that aren't referred to by any other object
    ids = set(all.keys())
    for i in all.values():
        ids.difference_update(id(j) for j in gc.get_referents(i))

    #this then is our set of objects without referrers
    return [all[i] for i in ids]

This turns out to work surprisingly well.   Combined with a object hierarchy browser, this allows us to find suspicious objects, identify them and thus home in on the C code that may be causing trouble.

There is a caveat to this, and it is that gc.get_objects() and gc.get_referents() are documented to only return objects that can be part of a reference cycle.  So your leaking strings and integers won’t show up using this tool.

update:

I just made two improvements to the code.

  1. It is faster and uses less memory to skip creating a dict out of the objects.
  2. We must make sure not to leave cyclic references lying about.  the “all” variable contains the current function frame so unless we clear it, the frame and its contents (including “all”) isn’t released immediately.
def get_leaking_objects2():
    all = gc.get_objects()
    try:
        ids = set(id(i) for i in all)
        for i in all:
            ids.difference_update(id(j) for j in gc.get_referents(i))
        #this then is our set of objects without referrers
        return [i for i in all if id(i) in ids]
    finally:
        all = i = j = None #clear cyclic references to frame

Using an isolated python.exe

Executive Summary:

If you want to completely control the sys.path of your copy of python.exe, do the following:

  1. Create a site.py next to it that contains nothing but the line “import sitecustomize”
  2. place a sitecustomize.py next to it, that rewrites sys.path to your liking
  3. run your python.exe, somewhat safe in the knowledge that it won’t be affected by any other python installed on the system.

The Long Explanation:

The problem

At CCP we work with many branches of many games.  All of these games use Python to some extent.  Each branch comes complete with its own python source tree, where local patches are added to Python and which may be of different python versions.  Each branch then builds its own python2x.dll and a python.exe, in addition, perhaps, to static versions of the python core for inclusion into a game executable.

What is more, each of these branches contains a plethora of offline tools.  These may be build scripts, test scripts and so on, and most of them are written in Python.  For general harmony, of course, this python version must be the same version as the one used in the game, that is, the offline tools used in a branch should use a Python version local to that branch.  This is where things become messy.

I’ve often vented my frustration to my colleagues about how “install oriented” Python appears to be.  For embedding, until recently there wasn’t even a way for a host application to completely control Python’s sys.path.  Python.exe will, when executed, go through a series of magic moves to guess an initial sys.path.  After this it will, unless instructed not to, try to import site.py which continues with the magic path munging process.

There is one alternative behaviour built into Python (yes, built in.)  If Python upon initialization detects that it is being run from something that looks like a build folder structure, it will initialize sys.path locally to that structure.  Otherwise, it will go ahead and set sys.path to what it thinks is the system wide sensible locations before importing site.py.

This then, is how python.exe is designed.  Either it is being built and then can live in a local, isolated, setting, or it is installed and uses machine global information for its environment.

For our branch specific tools, it was important to override this behaviour.  A buildstuff.cmd batch file in /games/foo/tools ought to be able to call ../bin/python.exe and have that particular copy of python.exe set up sys.path to, say, ../src/python27/Lib.  Further, this needs to happen without the kludgy help of environment variables or, dear I say it, registry settings.  These are a nightmare to manage in a distributed environment with gazillion developer machines, build bots, and so on.

Fortunately, there is something called sitecustomize.py.  The documentation specifies that after site.py is imported, python will try to “import sitecustomize” and this can be used to do local path adjustments.

The initial approach

When we started doing this in earnest we were using Python 2.6 both as an installed “tool” on developer machines and as the Python version in use for that particular branch.  We found that by copying python.exe out of the PCBuild build folder into a game/foo/bin folder, and placing a sitecustomize.py next to it, we could fully control sys.path the way we wanted.  The our sitecustomize.py would look something like this:

import sys
localdir = "/".join(__file__.split('\')[:-1])
root = localdir + "/../"
python = root + "src/python/python27/"
sys.path = [python+"PCBuild", python+"Lib", root + "modules"]

This works because python.exe’s directory is put in sys.path by Python built-in startup magic!  Of course, there is no reliable way to find it from the .py file, so the trick with __file__ is used.  Also, we can’t use os.path for our path manipulation since we really don’t want to import it.  Who knows what side effect is may have, importing stuff from the “default” sys.path.  But it is was good enough for our purposes.  For a while.

Switching to Python 2.7

Then we moved one branch to use Python 2.7.  Python was recompiled, and python.exe put in the bin folder as before, but suddenly, some machines (particularly the build machines) started failing.  The python tools complained that site could not be imported.

On investigation it turned out that previously all machines using this scheme had, by a happy coincidence, had Python 2.6 installed on them, and our local python.exe had been importing site.py from c:Python26Lib.  Now, python was looking for site.py in, among other places, c:Python27Lib (one of the magic path entries set up by python.exe and which it is impossible to override.)

To fix this, I placed an empty site.py next to sitecustomize.py and python.exe in the bin folder, hoping that would work.  Indeed, the build machines now succeeded in finding a site.py, but now sitecustomize.py wasn’t being run.

It wasn’t until I actually looked at pythonrun.c that I realized that sitecustomize.py is being imported and executed by the site.py in python’s standard library!.  So, it is site.py’s responsibility to call sitecustomize.py (and something called usercustomize.py that I had previously not known about.)

The solution then:  Instead of an empty site.py, have a site.py containing this code:

import sitecustomize
del sitecustomize

Musings

So, we have found that to have an isolated python.exe for which you control it’s sys.path absolutely with no external influences, you need a sitecustomize.py to override your sys.path.  But for that to run, you must provide your own site.py, in case a system-wide site.py isn’t found.

It is annoying that if a system-wide site.py is found, than this is run.  There, already, you have lost some control over your python.exe.  Who knows what a malicious system administrator may do in such a file?  So even with our approach, there is no absolute control.

It’s also annoying that you need two files to accomplish this task.  It would be nice if python.exe could be instructed to not do any automatic path guessing and just take its initial sys.path from a startup file.  The fact that python.exe has a built-in mechanism to set a different initial sys.path if it detects that it is being run out of a build folder, indicates that someone sometime recognized the need for this, but only took it half the way.

My suggestion would be this:  If a site.py (or site.pyc) is found in an initial place, e.g. next to the executable, then set the initial sys.path to only that place and skip all other magic.

This could then be used to great effect in the build system.  Instead of building into python.exe a separate folder structure for built environments, just create a site.py in the PCBuild directory (or the equivalent platform place) and set it up there.  Similarly, a site.py could be put in place in c:Python27bin and the entire ugly logic of initial path setting could be removed from python.exe.

As a side note, I have always thought that the initial path-guessing should be a feature of python.exe and not python27.dll as such.  python.exe should, in my opinion, call something like Py_Guess_Path() before calling Py_Initialize(), since it is a very specific behaviour of that particular embedding application.

selectmodule on PS3

As part of a game that we are developing on the Sony PS3, we are porting Stackless Python 2.7 to run on it.  What would be more natural?

Python is to do what it does best, act as a supervising puppetmaster, running high-level logic that you don’t want to code using something primitive such as C++.

While there are many issues that we have come across and addressed (most are minor), I’m just going to mention a particular issue with sockets.

An important role of Python will be to manage network connection using sockets.  We are using the Stacklesssocket module with Stackless Python do do this.  For this to work, we need not only the socketmodule but also the selectmodule.

The PS3 networking api is mostly BSD compliant with a few quirks.  Porting these modules to use it is mostly a straightforward affair of providing #defines for some API calls that have different names, dealing with APIs that are missing (return NotImplementedError and such) and mapping error values to something sensible.  For the most part, it behaves exactly as you would expect.

We ran into one error though, prompting me to write special handling code for select() and poll().  the Sony implementation of these functions not only returns with an error if they themselves fail (which would be serious) but they also indicate a error return if a socket error is pending on one of the sockets.  For example, if a socket receives a ECONNRESET, while you are waiting for data to arrive, select() will return with an ECONNRESET error indicator.  Not what you would expect.

The workaround is to simply filter the error values from select() and poll() and ignore the unexpected socket errors.  Rather, such a return must be considered a successful select()/poll() and for the latter function, ‘n’, the number of valid file descriptors, having been set as -1, must be recreated by walking the list of file descriptors.

Embedding Python and sys.path

When embedding a Python interpreter in another application, such as a game, a tricky parts is to get it to start up and find the right modules.  You typically want Python to:

  • Find your version of the standard library
  • Find your application modules
  • Not be confused by any other Python system that may be installed on the machine.

The application has absolute knowledge of the location of the python modules.  It has been installed in a known location and it knows where its resources are.

Unfortunately, Python often appears to be designed for the specific needs of python.exe, the “canonical embedding application”.  When one calls Py_Initialize(), a number of magic steps are invoked whereby the Python runtime tries to guess the initial setting for sys.path.  This may involve finding the program location, looking at the PYTHONPATH environment variable, and so on.  When Py_Initialize() returns, it may be too late to modify sys.path because in the mean time, module imports could have occured, such as that of copy_reg.

I have found that circumventing this involves patching pythoncore itself.  What we do in EVE before calling Py_Initialize() is to call a custom API, Py_SetPath() with an application-crafted semicolon-separated path string.  This will turn off any path-guessing in pythoncore and gives us absolute control.

Before we started doing this, for example, we could not encourage developers to have Python installed on their workstations, for example.  We would invariably run into conflicts where the embedded python picked up modules from the installed python and mysterious things would happen.  Now there is no need, and the embedded Python and the installed Python can coexsist without conflict.

I’ve recently contributed this patch to python.org.