I have been struggling to get matplotlib working ever since upgrading to OSX 10.8 Mountain Lion. It’s necessary for a number of different graphing/plotting things I want to do with NLTK and Pandas. It’s very frustrating. The solutions I’ve found for the kinds of problems I’ve been having don’t ever seem to work. (For example, here, here, here, and all these too.)
I want to get this working because, in part, I want to reproduce the lessons Matt Jockers presented on text analysis in R at the Digital Humanities Winter Institute.1 The Python Data Analysis Library provides the kind of data structures needed to do the kind of analysis offered in R. At any rate, it brings me back to matplotlib again.
So, I have tried the prescribed solutions to the installation problem in every imaginable combination, installing and uninstalling dependencies in different orders, to no avail. I was doing this using OS X’s built-in python, which with Mountain Lion is 2.7.2 and generally serves my purposes.
I finally decided to try using a virtualenv, and magically matplotlib started working. So, here’s how I did it:
Using the package manager homebrew, I installed the following:
$ brew install libpng freetype pkg-config
Then I used virtualenv and the truly awesome virtualenvwrapper to set up a fresh python instance just for this kind of text analysis:
$ mkvirtualenv text_analysis
(text_analysis) $ pip install numpy
(text_analysis) $ pip install scipy
(text_analysis) $ pip install pandas
(text_analysis) $ pip install nltk
(text_analysis) $ pip install ipython
(text_analysis) $ pip install tornado
(text_analysis) $ pip install pyzmq
And then, finally,
(text_analysis) $ pip install matplotlib
The only package of those listed above that is required for matplotlib is numpy, but the others are useful for what I’m working on.
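A quick way to confirm that the installs above actually took is to try importing each package from inside the env. What follows is a minimal sketch, not part of my original setup: the helper name check_imports is made up for illustration, and the module list simply mirrors the packages installed above (ipython imports as IPython, and the ZeroMQ bindings import as zmq).

```python
#!/usr/bin/env python
"""Report which of the packages installed above can be imported."""
import importlib

# Import names for the packages installed above.
MODULES = ["numpy", "scipy", "pandas", "nltk", "IPython", "tornado", "zmq", "matplotlib"]


def check_imports(names):
    """Map each module name to its version string, to "unknown" if it
    imports but exposes no __version__, or to None if the import fails."""
    results = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            results[name] = None
        else:
            results[name] = str(getattr(module, "__version__", "unknown"))
    return results


if __name__ == "__main__":
    for name, version in check_imports(MODULES).items():
        status = "MISSING" if version is None else version
        print("{0:12s} {1}".format(name, status))
```

Run it with the env activated. Anything reported as MISSING failed to import in that env.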
By contrast, doing this exact same thing with the Mac’s built-in python produced constant ImportErrors, regardless of whether I installed matplotlib with pip, with another package installer, or by building from source.
I’m glad to have it working again. If you’re having trouble getting matplotlib working, then try installing in a virtualenv, and by all means use virtualenvwrapper to manage your python envs. This may well require a change in your workflow, but it will be a change for the better. It’s good working practice to isolate your development environments, and better than installing site packages into your system python with sudo.
A couple of words on managing python environments:
To ensure your scripts will work with the activated env, make sure to use the correct shebang line at the top of the file:
#! /usr/bin/env python
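To see which interpreter the env lookup actually picks up, a script can report it. This is a small sketch. The name in_virtualenv is made up for illustration, and the check assumes one of two setups: a classic virtualenv, which sets sys.real_prefix, or a newer env in which sys.base_prefix differs from sys.prefix.

```python
#!/usr/bin/env python
"""Show which python interpreter is running this script."""
import sys


def in_virtualenv():
    """Return True if the interpreter appears to belong to a virtual env."""
    # Classic virtualenv sets sys.real_prefix; newer envs instead make
    # sys.base_prefix differ from sys.prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("interpreter: " + sys.executable)
    print("prefix:      " + sys.prefix)
    print("virtual env: " + str(in_virtualenv()))
```

With an env activated, the interpreter path should point inside that env rather than at the system python.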
As you get used to using envs, you may find a few standard sets of site packages you use again and again.
pip allows you to install packages from a special requirements file. While working in an environment, such as my text_analysis environment, you simply use pip freeze to generate one:
(text_analysis) $ pip freeze > requirements.txt
Then, in a new env you simply bootstrap with pip:
(new_env) $ pip install -r path/to/requirements.txt
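The requirements file is plain text, typically one pinned package per line in the form name==version. As a rough illustration of that format, the sketch below reads such lines into a dict. Both parse_requirements and the sample package names and versions are hypothetical, and it handles only simple name==version pins, not every form pip accepts.

```python
def parse_requirements(text):
    """Map package names to pinned versions for simple 'name==version' lines."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, option lines such as "-e ...", and
        # anything that is not a plain name==version pin.
        if not line or line.startswith("#") or line.startswith("-") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


if __name__ == "__main__":
    # Hypothetical sample; real contents depend on what is in your env.
    sample = "example-package==1.0.0\n# a comment\nanother-package==2.3\n"
    print(parse_requirements(sample))
```

This can be handy for comparing the pins from two envs before bootstrapping a new one.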
If you’re interested in using ipython or ipython notebooks, you’ll also need to install readline. The ipython folks recommend installing it with easy_install instead of pip, because it works better for reasons I’m not entirely sure of. The process is the same as above:
(text_analysis) $ easy_install readline
virtualenvwrapper makes it very easy to list site packages (lssitepackages), make new envs (mkvirtualenv), and list and switch between them (workon). For more, see the documentation linked above.