Using an isolated python.exe

Executive Summary:

If you want to completely control the sys.path of your copy of python.exe, do the following:

  1. Create a site.py next to it that contains nothing but the line “import sitecustomize”
  2. Place a sitecustomize.py next to it that rewrites sys.path to your liking
  3. Run your python.exe, somewhat safe in the knowledge that it won’t be affected by any other Python installed on the system.

The Long Explanation:

The problem

At CCP we work with many branches of many games.  All of these games use Python to some extent.  Each branch comes complete with its own python source tree, where local patches are added to Python and which may be of different python versions.  Each branch then builds its own python2x.dll and a python.exe, in addition, perhaps, to static versions of the python core for inclusion into a game executable.

What is more, each of these branches contains a plethora of offline tools.  These may be build scripts, test scripts and so on, and most of them are written in Python.  For general harmony, of course, this python version must be the same version as the one used in the game, that is, the offline tools used in a branch should use a Python version local to that branch.  This is where things become messy.

I’ve often vented my frustration to my colleagues about how “install oriented” Python appears to be.  For embedding, until recently there wasn’t even a way for a host application to completely control Python’s sys.path.  Python.exe will, when executed, go through a series of magic moves to guess an initial sys.path.  After this it will, unless instructed not to, try to import site.py, which continues the magic path munging process.

There is one alternative behaviour built into Python (yes, built in.)  If Python upon initialization detects that it is being run from something that looks like a build folder structure, it will initialize sys.path locally to that structure.  Otherwise, it will go ahead and set sys.path to what it thinks are the sensible system-wide locations before importing site.py.

This then, is how python.exe is designed.  Either it is being built and then can live in a local, isolated, setting, or it is installed and uses machine global information for its environment.

For our branch-specific tools, it was important to override this behaviour.  A buildstuff.cmd batch file in /games/foo/tools ought to be able to call ../bin/python.exe and have that particular copy of python.exe set up sys.path to, say, ../src/python27/Lib.  Further, this needs to happen without the kludgy help of environment variables or, dare I say it, registry settings.  These are a nightmare to manage in a distributed environment with a gazillion developer machines, build bots, and so on.

Fortunately, there is something called sitecustomize.py.  The documentation specifies that after site.py is imported, python will try to “import sitecustomize” and this can be used to do local path adjustments.

The initial approach

When we started doing this in earnest we were using Python 2.6 both as an installed “tool” on developer machines and as the Python version in use for that particular branch.  We found that by copying python.exe out of the PCBuild build folder into a game/foo/bin folder, and placing a sitecustomize.py next to it, we could fully control sys.path the way we wanted.  Our sitecustomize.py would look something like this:

import sys
localdir = "/".join(__file__.split('\\')[:-1])
root = localdir + "/../"
python = root + "src/python/python27/"
sys.path = [python+"PCBuild", python+"Lib", root + "modules"]

This works because python.exe’s directory is put in sys.path by Python's built-in startup magic!  Of course, there is no reliable way to find it from the .py file, so the trick with __file__ is used.  Also, we can’t use os.path for our path manipulation since we really don’t want to import it.  Who knows what side effects it may have, importing stuff from the “default” sys.path.  But it was good enough for our purposes.  For a while.
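
A slightly more defensive variant of the same idea can be sketched as follows. The helper name build_local_path is hypothetical, and the directory layout is simply the one assumed in the snippet above. It sticks to plain string methods, so nothing needs to be imported, and it tolerates either kind of slash in __file__.

```python
def build_local_path(this_file):
    """Return the sys.path entries for a branch, given the path of this file.

    Only string methods are used, so no other module gets imported.
    """
    # Normalise Windows separators, then drop the file name itself.
    localdir = this_file.replace("\\", "/").rsplit("/", 1)[0]
    root = localdir + "/../"
    python = root + "src/python/python27/"  # assumed layout, as above
    return [python + "PCBuild", python + "Lib", root + "modules"]

# In a real sitecustomize.py this would be something like:
#     import sys
#     sys.path[:] = build_local_path(__file__)
example = build_local_path("C:\\games\\foo\\bin\\sitecustomize.py")
print(example)
```

Assigning through sys.path[:] rather than rebinding sys.path is a design choice: it keeps the same list object, in case something already holds a reference to it.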

Switching to Python 2.7

Then we moved one branch to use Python 2.7.  Python was recompiled, and python.exe put in the bin folder as before, but suddenly, some machines (particularly the build machines) started failing.  The python tools complained that site could not be imported.

On investigation it turned out that previously all machines using this scheme had, by a happy coincidence, Python 2.6 installed on them, and our local python.exe had been importing site.py from c:\Python26\Lib.  Now, python was looking for site.py in, among other places, c:\Python27\Lib (one of the magic path entries set up by python.exe and which it is impossible to override.)
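
A quick way to spot this kind of accidental dependency is to ask the interpreter where a module actually came from. The following is only a diagnostic sketch; the helper name module_origin is made up for illustration.

```python
import sys

def module_origin(name):
    """Import a module by name and report the file it was loaded from."""
    module = __import__(name)
    # Built-in modules (sys, for example) have no __file__ attribute.
    return getattr(module, "__file__", None)

print("interpreter: %s" % sys.executable)
print("site loaded from: %s" % module_origin("site"))
```

If the reported location lies outside the branch, the local python.exe is leaning on whatever happens to be installed on that machine.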

To fix this, I placed an empty site.py next to sitecustomize.py and python.exe in the bin folder, hoping that would work.  Indeed, the build machines now succeeded in finding a site.py, but now sitecustomize.py wasn’t being run.

It wasn’t until I actually looked at pythonrun.c that I realized that sitecustomize.py is being imported and executed by the site.py in python’s standard library!  So, it is site.py’s responsibility to call sitecustomize.py (and something called usercustomize that I had previously not known about.)

The solution then:  Instead of an empty site.py, have a site.py containing this code:

import sitecustomize
del sitecustomize


So, we have found that to have an isolated python.exe whose sys.path you control absolutely, with no external influences, you need a sitecustomize.py to override your sys.path.  But for that to run, you must provide your own site.py, in case a system-wide site.py isn’t found.

It is annoying that if a system-wide site.py is found, then that one is run.  There, already, you have lost some control over your python.exe.  Who knows what a malicious system administrator may do in such a file?  So even with our approach, there is no absolute control.
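
Whether the isolation actually holds on a given machine can be checked by listing the sys.path entries that fall outside the branch. The sketch below is an assumption-laden illustration: entries_outside is a hypothetical helper, it compares case-insensitively (as on Windows), and it does not resolve ".." components.

```python
def entries_outside(paths, root):
    """Return the entries of paths that do not live under root."""
    norm_root = root.replace("\\", "/").rstrip("/").lower() + "/"
    outside = []
    for entry in paths:
        # Normalise separators and case before comparing prefixes.
        norm = entry.replace("\\", "/").rstrip("/").lower() + "/"
        if not norm.startswith(norm_root):
            outside.append(entry)
    return outside

# Sample data only; in practice pass sys.path and the branch root.
sample = [
    "C:\\games\\foo\\src\\python\\python27\\Lib",
    "c:\\Python27\\Lib",
    "C:/games/foo/modules",
]
leaks = entries_outside(sample, "C:/games/foo")
print(leaks)
```

An empty result on every build machine is what we are after; anything else points at a system-wide location leaking in.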

It’s also annoying that you need two files to accomplish this task.  It would be nice if python.exe could be instructed to not do any automatic path guessing and just take its initial sys.path from a startup file.  The fact that python.exe has a built-in mechanism to set a different initial sys.path if it detects that it is being run out of a build folder, indicates that someone sometime recognized the need for this, but only took it half the way.

My suggestion would be this:  If a site.py (or site.pyc) is found in an initial place, e.g. next to the executable, then set the initial sys.path to only that place and skip all other magic.

This could then be used to great effect in the build system.  Instead of building into python.exe a separate folder structure for built environments, just create a site.py in the PCBuild directory (or the equivalent platform place) and set it up there.  Similarly, a site.py could be put in place in c:\Python27\bin and the entire ugly logic of initial path setting could be removed from python.exe.
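
To make the proposed rule concrete, here it is as a few lines of Python. This is purely a sketch of the suggestion, not how any existing Python behaves; the function name and the file_exists callback are invented for illustration.

```python
def proposed_initial_path(exe_dir, file_exists):
    """Sketch of the proposed startup rule (hypothetical behaviour).

    exe_dir     -- directory containing the executable
    file_exists -- callable taking a path and returning True or False
    Returns [exe_dir] when a local site.py or site.pyc is present,
    or None to signal "fall back to the usual path guessing".
    """
    for name in ("site.py", "site.pyc"):
        if file_exists(exe_dir + "/" + name):
            return [exe_dir]
    return None

# A set stands in for the file system so the sketch is self-contained.
present = set(["/games/foo/bin/site.py"])
local = proposed_initial_path("/games/foo/bin", present.__contains__)
fallback = proposed_initial_path("/opt/other/bin", present.__contains__)
print(local)
print(fallback)
```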

As a side note, I have always thought that the initial path-guessing should be a feature of python.exe and not python27.dll as such.  python.exe should, in my opinion, call something like Py_Guess_Path() before calling Py_Initialize(), since it is a very specific behaviour of that particular embedding application.


8 thoughts on “Using an isolated python.exe”

  1. As you are building a custom python.exe executable anyway, why not have site be a builtin module which then manages the path and other issues. You can then use some interesting tricks to even import the main site.py on the local machine to get those local configurations. This is what we do at work. This also allows us to do things like have a top level C++ exception handler and build the python library with -fexceptions, so that unhandled C++ exceptions thrown in our extension libraries or custom builtins are caught at the top level and a python stack is dumped along with crash diagnostics.

    Will try to dig up the code we use to have the builtin module shadow the real python module (fun trick).

    • An interesting thought. Yes, we do make our “own” Stackless Python, but we try to modify it as little as possible so that it can still be used as a regular python. It passes all the regression tests, for example. Having a builtin site.py would nullify that, unless that site.py were clever enough to detect the situation and use the installed… The head starts spinning. Really, writing extra code to persuade python not to do unwanted work (computing the initial sys.path and sys.prefix and whatnot) shows that there are improvements to be made.

  2. Did you know that “python -S” will turn off site importing? And “python -ES” will also ignore PYTHONPATH?

    • Yes, and this leaves Python with the initial “magic” default sys.path, the opposite of what I’m trying to do. Achieving things with command line flags is not ideal, but I’d be willing to do that (e.g. through a Python.bat file) if needed.
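
      For reference, the effect of those two flags can be observed from inside a child interpreter. This is just a small sketch; the helper name flags_seen_by_child is made up, and it uses whatever interpreter runs the script.

```python
import subprocess
import sys

def flags_seen_by_child(*options):
    """Run a child interpreter with the given options and report two flags."""
    code = ("import sys; "
            "print('%d %d' % (sys.flags.no_site, sys.flags.ignore_environment))")
    out = subprocess.check_output([sys.executable] + list(options) + ["-c", code])
    return out.decode("ascii").strip()

# -S skips the site import, -E ignores PYTHON* environment variables.
result = flags_seen_by_child("-S", "-E")
print(result)
```

      Neither flag touches the built-in guess for the initial sys.path, which is the part I actually want to replace.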

  3. Hi, at first sight it seems that a combination of virtualenv + buildout could possibly help getting the environment that you want. Why didn’t you try those tools?

    • I looked at Virtualenv a few months ago. It works by duplicating an existing installed python environment into a specific place. I don’t think it is suitable for use in a source control system where depots can be checked out to any particular place. I’m also not familiar with Buildout and the internet appears to be closed today so I can’t investigate, but all of this just further illustrates my point:
      Why would you need a combination of two whole systems to do something as simple as to tell your python.exe what the initial path should be and to ignore any system-wide settings?
      Basically, all I’m trying to do is to run python in the same way that it runs out of its build folders, but with a possibly (and in fact in my case, actually) different tree structure. And it works, using the approach I use, but it is not without its flaws and is open to breakage if the system-wide site.py does something evil (I don’t think virtualenv is immune to that either. If it is, I would want to know how.)

    • First off, I did mention that I dislike environment variables. You shouldn’t have to adjust some obscure environment variables depending on which branch you’re currently in. This is error prone and non-obvious. Specifying which python.exe you are calling should be clue enough.
      Second, the docs say that:

      For example, if PYTHONHOME is set to /www/python, the search path will be set to ['', '/www/python/lib/pythonX.Y/', '/www/python/lib/pythonX.Y/plat-linux2', ...].

      Again, magic. Someone decides for me that I should put things in a place called lib/pythonX.Y.
      Then at least PYTHONPATH is slightly more helpful, because the contents of that are put in front of sys.path, and this allows me to specify where site.py will be found.

      But, as I’ve said, relying on an environment variable to direct Python to a custom place for modules, is error prone and dangerous, as evidenced, again, by how that method is avoided when running Python out of a build tree.
