Ok, I confess: As a core Python contributor, and an embedder of Stackless Python into a large application, I don’t often work with the installed, vanilla version of Python that the world at large knows and loves.
But a recent comment on one of my posts here prompted me to have a look at an off-the-shelf library to visualize graphs. I’m currently working on an idea that involves binary trees and I thought that prototyping and visualizing my problem in Python instead of nitty-gritty C would make perfect sense. So this is where my journey started. This is the tale of a software engineer installing a Python application from the internet.
So, a fresh install of Python. I got the latest x64 binaries of Python 2.7 from python.org and installed them. This I’ve done lots of times and it just worked. No problems there. But what about installing the software?
The project’s home page said that I should “pip install objgraph” to install the software. Knowing that this is supposed to be a command line, I went ahead and tried it, fully expecting it not to work. As it didn’t. So, something else is required.
What is this “pip” then? Googling for it turned up this page: http://pip-group.org/v2/index.html. Not what I expected. “pip python” then: http://pypi.python.org/pypi/pip. Yes, an “easy install replacement.” Seems like it fits the bill. Then how do I install it? The page doesn’t say. It says a lot about pip, but not how to install it. But there is a “Downloads” link, which I follow to find a .zip file which I download.
Now, I’m a software engineer, and a Python veteran, so I know enough to unzip this file into a temporary directory. I also find a “setup.py” file in there and so, of course, I open a command shell and type the following:
(notice how I have to specify the full path to Python. The installer doesn’t add Python to the path for me. Tedious.)
Imagine my lack of surprise when my Python complains about not finding something called “setuptools,” then.
Ok then, back to the drawing board. Obviously, “setuptools” are required to install and they appear to be yet another external package. Google brings me to this page: http://pypi.python.org/pypi/setuptools
This time, installation promises to be simple. There is a Windows .exe installer. I download and run the “win32-py27.exe.” Again, I am not particularly surprised when it fails to find a Python 2.7 installation. After all, I elected to install the 64 bit Python, since my computer speaks 64 bits natively. Reading through the page and grepping for “x64” led me to conclude that I should download the “source” version of setuptools. So, again, I draw on my engineering skills, download and unzip and then “c:\python27\python.exe tmp\setuptools\setup.py install” as suggested for win64. And it works.
Now then, back to installing pip. This time around, “c:\python27\python.exe c:\tmp\pip\setup.py install” does the trick. The script seems to indicate a successful execution.
So then, can I start to “pip install objgraph”? Nope. There is no “pip” in the path. Okay then, well, “pip” is a Python module, isn’t it? Shouldn’t I then just type “c:\python27\python.exe -m pip install objgraph”? No luck either. Python informs me that “pip” is a package and cannot be run like that.
Okay, I know what is going on. There is a .py script somewhere, pip.py that I need to run. The instructions are written for a Unix user that has the directory in its PATH and the file has the magic #!python in the first line. Yes, I used to be a unix guy. So, where is this file then? Looking closely, I finally find pip.exe under c:\python27\scripts. So, finally:
c:\python27\scripts\pip.exe install objgraph
This works and installs the library. Great!
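The hunt for pip.exe can be shortened by asking the interpreter where it puts installed scripts. What follows is only a minimal sketch: `scripts_dir` and `find_script` are names made up for illustration, and the list of file name variants it tries is an assumption about how installers name their launchers, not something any tool guarantees.

```python
import os
import sysconfig


def scripts_dir():
    """Return the directory where this interpreter installs scripts."""
    return sysconfig.get_path("scripts")


def find_script(name):
    """Return the full path of an installed script, or None if not found.

    The variants tried here (bare name, .exe, -script.py) are an assumption
    about common launcher names, not an exhaustive list.
    """
    base = scripts_dir()
    for candidate in (name, name + ".exe", name + "-script.py"):
        path = os.path.join(base, candidate)
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    print(scripts_dir())
    print(find_script("pip"))
```

Run with the interpreter in question, this prints its scripts directory and, if pip is installed there, the full path to type on the command line.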
But what next? The objgraph page recommended I use “Graphviz” for the spiffy graphics, specifically the “xdot” package. So I install (using pip.exe) xdot, only to find that it can’t run because it is missing the “gobject” module.
And, at this point we branch off on a different Odyssey involving GTK+, PyGTK, PyCairo, PyGObject and Graphviz that is too long and painful to recount. Suffice to say that I gave up.
There are a few main points that I’m trying to make with all this. Here they are:
- Installing Python on a Windows machine doesn’t add it to the PATH. Also, there are no automatic shell associations. You have to know how to run your Python scripts from the command line.
- There is no simple starting point for installing packages. At the very least one has to locate and install setuptools, and how to do that is not obvious, at least if one has made the mistake of installing a 64 bit version of Python. And then one has to locate and install the friendlier “pip” installer using a lengthy command line.
- Most of the instructions encountered assume a Unix environment. Perhaps the experience is smoother on Unix. Perhaps not.
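The first point is easy to check from inside Python. The following is a rough sketch, with `interpreter_on_path` being a hypothetical helper name; it only compares directory names textually and does not resolve symlinks or launcher shims.

```python
import os
import sys


def interpreter_on_path(path_value=None, exe=None):
    """Return True if the directory holding the interpreter is listed in PATH.

    path_value and exe default to the current PATH and sys.executable;
    they are parameters only so the check can be tried on other values.
    """
    exe = exe or sys.executable
    exe_dir = os.path.normcase(os.path.normpath(os.path.dirname(os.path.abspath(exe))))
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    for entry in path_value.split(os.pathsep):
        # Skip empty entries, which appear when PATH has stray separators.
        if entry and os.path.normcase(os.path.normpath(entry)) == exe_dir:
            return True
    return False


if __name__ == "__main__":
    print(interpreter_on_path())
```

If this prints False, typing “python” at a prompt will not find this interpreter and the full path has to be spelled out.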
This wasn’t the first time I tried to install a Python package. And I admit that I am being deliberately obtuse to illustrate a point. But every time I go through these moves I am amazed at how cumbersome it is. I am a computer veteran of the Jupiter ACE era and know my onions but there are people out there that don’t and who will be turned away much earlier than I was and that is a shame.
But there is also another issue here that I find unsettling with the whole process. Even assuming that all goes well, and that I have the proper install tools and know how to use them: Whenever I want to go and check out some Python project or other, I find that I have to install it and, what is more, install all its prerequisites. And so on ad infinitum. Very soon, you find that you have installed all kinds of modules and packages, permanently modifying your Python environment. There is no obvious cleanup mechanism and no way of tracking dependencies. I am loath to do this because I always have this uneasy feeling that my Python install is somehow tainted after doing this.
So, this post is digressing into another one about packaging, so I will stop here and rant about that at a later date.
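One way to at least see what an install has added is to snapshot the list of installed distributions before and after. The sketch below is an illustration only and assumes Python 3.8 or later, where `importlib.metadata` is in the standard library; it is not available in the Python 2.7 install described above, and `snapshot` and `newly_installed` are hypothetical helper names.

```python
from importlib import metadata


def snapshot():
    """Map each installed distribution name to its version string."""
    installed = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with broken or missing metadata
            installed[name] = dist.metadata.get("Version")
    return installed


def newly_installed(before, after):
    """Return the distributions present in 'after' but not in 'before'."""
    return {name: ver for name, ver in after.items() if name not in before}


if __name__ == "__main__":
    before = snapshot()
    # ... install something here, then take a second snapshot ...
    after = snapshot()
    print(newly_installed(before, after))
```

This only reports what appeared; it does not say which package pulled in which dependency, and it does nothing about cleanup.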
16 thoughts on “Installing a Python library: A traveler’s tale.”
Honestly, I think the problem you’re encountering is, more generally, a problem with any and all software on Windows (and, to a lesser extent, OS X). Linux package managers have dealt with these exact problems and come to good enough solutions. I think it’s fair to say that the majority of python developers routinely use some flavor of linux for their dev environments. And of those that don’t, virtualenv is a popular ‘solution’ to the problem of a ‘dirty’ python distribution.
I think there is a long-term dream of making a complete package manager for python eggs, but it’s far from reality.
I’m actually glad that the basic Python installer doesn’t change my PATH settings, I often find that I have to clean up my path after installing some program under Windows, and in this case, one may have more than one version of Python installed. I do agree that installing packages can be a pain.
To avoid tainted Python environments, I use virtualenv. It makes life much simpler…
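For readers who have not seen the idea, here is a minimal sketch of creating such an isolated environment. It uses the `venv` module from the Python 3 standard library rather than the virtualenv package itself, which is a substitution made for the sake of a self-contained example; `make_sandbox` and the `.sandbox` directory name are illustrative only.

```python
import os
import tempfile
import venv


def make_sandbox(parent_dir, name=".sandbox"):
    """Create an isolated environment under parent_dir and return its path.

    with_pip=False keeps the sketch fast and offline; pass with_pip=True
    to venv.create if pip should be bootstrapped into the environment.
    """
    target = os.path.join(parent_dir, name)
    venv.create(target, with_pip=False)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = make_sandbox(tmp)
        # Every venv carries a pyvenv.cfg marker file at its root.
        print(os.path.isfile(os.path.join(path, "pyvenv.cfg")))
```

Packages installed with the environment's own interpreter land inside that directory, so deleting the directory removes them.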
@Benjamin: I concur with you: package managers are very useful, especially for installing the non-Python dependencies of some modules. Mac OS X has the very good Fink package manager (http://www.finkproject.org/); thanks to Fink, I basically never had any problem with the installation of Python modules on Mac OS X.
@Kristján: The setuptools (easy_install) thing really annoys me each time I want to install additional Python modules on a new machine. I feel like it would usefully be part of the standard Python distribution (but always thought that there may be some good reasons for why it is not found there).
I know that you don’t want to hear this, but I can’t resist telling the same story on Ubuntu:
>> apt-get install python-objgraph
The following extra packages will be installed:
graphviz libcgraph5 libgvpr1
The following NEW packages will be installed:
graphviz libcgraph5 libgvpr1 python-objgraph
0 upgraded, 4 newly installed, 0 to remove and 0 not upgraded.
Need to get 742kB of archives.
After this operation, 2.040kB of additional disk space will be used.
Do you want to continue [Y/n]? y
And that’s it. If I ever have the urge to remove the package, ‘apt-get purge python-objgraph’ will remove it completely. And an additional ‘apt-get autoremove’ will also remove ‘graphviz’ and the others if they are no longer needed by other packages…
The official python.org distribution is bad under Windows. I gave up on it long ago and switched to ActivePython. It gets all the PATH, associations and shell stuff right and includes pip, pypm, pywin32 and some other packages preinstalled. They also build binary versions of a lot of C modules for Windows so that I don’t have to do it manually (for x64, some binary packages are available only in their commercial version for some reason).
The worst part is that you probably would be better off installing distribute instead of setuptools. 😉
Now, most of the problems you experience here are a case of bad documentation. Distribute’s documentation is better: it tells you to download a script and run it, and that’s supposed to work on Windows too (but I haven’t tried it).
The Setuptools instructions are very complex, and I’m not sure why, since it is a pure-python library that can be installed from source on all platforms. I’m extremely surprised that pip doesn’t mention that setuptools is required. The docs clearly assume you are a setuptools user already.
Also the distribute/pip functionality should be a part of Python core, and that will likely happen with distutils2 sometime in the future.
I agree that Python packaging sucks.
One thing: You complained about having to type the entire path to `python.exe` when running a `.py` script in Windows’ shell. You don’t need to. You don’t even need to write `python`; you just run the `.py` script itself as if it were an executable and Windows will automatically use your default Python for it.
I am aware of virtualenv (and have been for six months). The problem with this approach is that 1) it is not for your average user, who is unlikely to have heard of it, and 2) it is a kludge. Really, create a new root? And the way it goes about its “virtualization” is probably based on a hack. I’ve blogged about isolating Python before.
Also, I think all this installing is a pretty dated concept. Remember when you could pop a floppy into your Mac containing a word processor and just run it? No installation required.
Shared libraries, a cool concept from the 80s allowing you to save on disk space, contribute to DLL hell, or more generally, versioning hell. This is also a problem with python, where two different packages may depend on different versions of a third package.
But with more memory and broader band, why bother with it? Wouldn’t it be cool if you could distribute a library, let’s say “wonderlib” and it came packaged with all its dependencies? Isolated, without affecting other code? You can do this with C, with Java (I think) and others. Running flash games in your browser or java applications never seems to require installing anything, or if it does, these things are pretty isolated.
But now I’m back talking about packaging, a flammable subject.
Ha. So true. I have spent so much time doing this sort of thing. As a result I try not to use windows when possible, I sleep much better now.
I’ve used the “official” Python installation packages for Windows (32-bit), and they do install Windows file associations. So double-clicking on a “.py” file runs it, and typing its name in the console also runs it. I can’t confirm for 64-bit.
A few random comments:
Python doesn’t add itself to your path because a lot of people who use python on Windows have multiple versions installed, and since symlinks are the way they are in the Windows world, you can’t manage this nearly as easily as you can on a Unix-y system. I have some scripts I’ve hacked together to help with this, perhaps I’ll clean them up, package them and release them.
I recommend always installing the pywin32 extensions when installing python on Windows, even if you don’t think you’ll need it. It makes your life much easier IMO.
I, too, have never gotten the Setuptools installer to work properly. I typically use Distribute now, which is a clone of setuptools, and I don’t have any trouble with it. If I really need to install Setuptools for some reason, I usually use the ez_setup.py script, which still worked last time I checked.
virtualenv has more to do with running different applications that require conflicting versions of python modules than it has to do with DLLs. I see you mentioned that, but I wanted to clarify for other readers, as I had to read it twice. And while yes, it’s a pretty severe solution to the problem, the fact that it is that severe means it works.
Also, there are several ways of distributing Python applications along with their dependencies: Py2exe, PyInstaller, zip files, etc. But this is the problem that Setuptools, Buildout, pip etc. were created to solve. The idea of distributing all of an app’s or library’s dependencies along with it doesn’t work too well in general, because then it becomes really difficult to upgrade those dependencies. If package A depends on package B, and a major security flaw is revealed in package B, you need to be able to upgrade package B. But then you have to remember to upgrade it on every machine you subsequently install package A on. It quickly becomes a mess. A large part of the Python community prefers to say “install package A along with all its dependencies” than deal with this.
Note also that virtualenv has the ability to use “bootstrap scripts” to
Hopefully the distutils2 project (which is getting closer and closer to completion) will straighten out some of this mess. It should provide a much saner way of handling dependencies, installing packages, etc.
Running .py files bare from the command-line requires adding “.py” to the PATHEXT environment variable. I know the python.org installer doesn’t do this.
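Whether a given machine is set up this way can be checked with a few lines. This is a small sketch only; `py_in_pathext` is a made-up helper name, and it relies on PATHEXT being a semicolon-separated list, as it is on Windows.

```python
import os


def py_in_pathext(pathext=None):
    """Return True if '.py' is listed in PATHEXT (case-insensitive).

    pathext defaults to the PATHEXT environment variable; on systems
    without that variable the result is simply False.
    """
    if pathext is None:
        pathext = os.environ.get("PATHEXT", "")
    extensions = [ext.strip().lower() for ext in pathext.split(";") if ext.strip()]
    return ".py" in extensions


if __name__ == "__main__":
    print(py_in_pathext())
```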
I disagree. I’ve had far more trouble with the ActiveState distribution than I’ve ever had with the python.org one.
Those interested in package management for windows might want to check out CoAPP:
though I’m not sure anything will ever come of it.
If Setuptools (or Distribute which I haven’t heard of) is de-facto, why doesn’t it just come with the distribution? Seriously. Why let every school-teacher or well meaning python curious windows user out there get put off by this?
I see your argument about the “security flaw in package B” all the time. This is a specialization of one of the big selling points for shared libraries back in the 80s: “And if then suddenly an update is released for the widget.dll, you can install it and then both your grapher.exe and drawer.exe will transparently benefit!”. The “security flaw” argument is a subset of this, but really, how often does that happen? What security flaws are people so paranoid about that they are willing to subject themselves (and everybody else) to DLL hell just to have a more convenient way of dealing with a security problem should it arise? Is this an actual problem?
(And thanks for pointing out that by using the well known phrase “DLL hell” I mean the general problem of shared libraries, be they DLLS, Python Packages, etc.)
I think we as a community should be focusing more on ease of use. And this must come as a design principle from the ground, not as something provided by third party re-packagers. Personally, I think the concept of a centrally installed python and a library repository is dated. Even if you have ~user specific library paths and what not, it is just a variation on the theme.
Look at Java. I am not a java developer, so maybe for a developer it is different, but as a user, I install a Java runtime once (one that continually reminds me to update, but that’s a different story 🙂 ) I then simply click and run java programs. Some run in the browser, some on the desktop. Never am I asked to download any additional dependencies, let alone install them. I don’t get the feeling that the downloaded programs leave some dodgy libraries lying around somewhere. And they probably don’t, or interoperability would be gone.
Python started life as a Unix tool with a similar architecture to Perl. Essentially a console program with a PYTHONPATH and some .pythonrc -like options. And while out of that grew a beautiful, expressive and easy to learn programming language, the implementation (of C-python) is still very much stuck with that heritage (fodder for a different blog: Why does pythonlib include so much python.exe implementation detail? Not good for embedding). And that makes Python a command line hacker’s programming language, an anachronism in the modern age of web-centered programming.
I’ve written a little tool that makes it a bit easier. The following long line sets up a virtualenv with objgraph installed into .sandbox, and has no dependencies other than Python:
c:\python27\python -murllib https://bitbucket.org/tlynn/sandboxer/raw/tip/sandboxer.py | c:\python27\python - objgraph
It’s limited but occasionally useful. The docs are at https://bitbucket.org/tlynn/sandboxer/.
Java programs, for the most part, are written in Java. If you include all of the class files, any Java machine will run it. Much of Python is actually a combination of Python and C. It is easier (for a dev) to hook into something like setuptools than to provide all the required DLLs or EXEs for Windows, appropriate source on Linux, and who knows what on Mac. I agree that where possible programs and libraries should include dependencies, but this is often harder than you make it sound.
It also precludes the developer from using a different version of a dep than the author may have statically provided. I assure you, this does come up. It’s not only the security issue. Sometimes a newer version is much more performant. Then again, in this case it is not difficult for the developer to remove the shared lib so it uses his system like normal.