I have been running Python-based Jupyter Notebooks for some time but never thought about using environments until quite recently. I had heard people talking about environments, but I didn’t understand why I would need one.
Two days ago, I tried to upgrade to the latest version of the Musical Gestures Toolbox for Python and got stuck in a dependency nightmare. I tried to upgrade one of the packages that choked, but that only led to other packages breaking. I suddenly also found myself in a situation where one package wanted a newer version of a dependency while another wanted an older version. My approach of installing every package I would ever need in one place was not a good idea. Then I realized that this is why people use environments.
What is an environment?
According to this page, virtual Python environments date back to the early 2000s. As the number of packages grew, and they came to depend on different versions of other packages, things quickly became chaotic. The virtualenv tool was first released in 2007, and a subset of its functionality was integrated into the standard library as the venv module with Python 3.3 in 2012.
An environment is a “sandbox” in which one can install packages that work independently of the rest of the system. In practice, a folder on the system holds a copy of the installed packages and some settings guiding how the environment works. However, where does one start?
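To make the "folder on the system" idea concrete, here is a minimal sketch that uses Python's built-in venv module to create a throwaway environment and look inside it. The folder name `demo_env` and the helper `make_sandbox` are made up for illustration, and the environment is created without pip so that the example runs quickly and offline.

```python
import tempfile
import venv
from pathlib import Path


def make_sandbox(parent: Path) -> Path:
    """Create a bare virtual environment under `parent` and return its path."""
    env_dir = parent / "demo_env"  # hypothetical folder name
    # with_pip=False skips bootstrapping pip, which keeps the sketch fast.
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    env_dir = make_sandbox(Path(tmp))
    # pyvenv.cfg holds the settings that mark this folder as an environment.
    config_text = (env_dir / "pyvenv.cfg").read_text()
    # The top-level entries differ slightly between operating systems.
    top_level = sorted(p.name for p in env_dir.iterdir())

print(config_text)
print(top_level)
```

The printed file list shows that the environment really is just a folder with a settings file and a few subfolders for the interpreter and the installed packages. A Conda environment is laid out differently, but the principle is the same.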
Which environment to use?
One of the great things about Python is that there are always many ways to do the same thing. This is both a blessing and a curse; it makes the language super-flexible and powerful, yet confusing for beginners and less experienced developers. I count myself in the latter group. I have worked in Python for several years but at a relatively superficial level. The fact that I have not seen the need for environments before now says it all.
In my despair, I asked on Mastodon, my current social media hangout after leaving Twitter, about which environment to use. Some people suggested venv, the standard environment solution, which is also described here. Others argued for the simplicity of poetry. I tried to follow the tutorial described here, but it failed in an early step, and I gave up.
Someone had suggested Conda, and I remembered that I actually have Anaconda installed. I have primarily thought of Conda as a package manager (come to think of it, I only use pip for installing packages, so I am not sure I actually need Conda). However, I came across a tutorial for setting up a Conda environment with Jupyter Notebook, which is precisely what I needed.
Getting up and running
The process was straightforward. In the terminal, I started by creating the environment:
conda create -n environment_name
Then I activated it:
conda activate environment_name
At this point, the environment is empty and ready to be filled. I started by installing Jupyter Notebook:
conda install jupyter notebook
The instruction said that I also had to install Jupyter with pip:
pip install jupyter
I am unsure why pip is needed here, but it worked, so I will leave it in for now.
Then I moved on to install the packages I needed for my project:
pip install pandas numpy scipy matplotlib musicalgestures
By installing all of them at once, pip sorts out all the dependencies for all of them automagically. I see that some people write that Conda is better at sorting out complex dependencies than pip, but for my use, this worked fine.
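To see what pip has to juggle, it can help to look at the requirements each installed package declares. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `declared_requirements` is my own, and the package names are just the ones from the command above, which may or may not be installed where this is run.

```python
from importlib import metadata


def declared_requirements(package: str) -> list:
    """Return the requirement strings a package declares, or [] if unknown."""
    try:
        requirements = metadata.requires(package)
    except metadata.PackageNotFoundError:
        return []  # the package is not installed in this environment
    # requires() returns None when a package declares no requirements.
    return sorted(requirements or [])


for name in ["pandas", "matplotlib"]:
    print(name)
    for requirement in declared_requirements(name):
        print("   ", requirement)
```

When two packages list the same dependency with incompatible version ranges, no single installed version can satisfy both, which is the kind of conflict that made me look into environments in the first place.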
Finally, I could start up a Jupyter Notebook:
jupyter notebook
Since I started the notebook from within the activated environment, it used the packages installed there right from the start.
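A quick way to confirm that a notebook is running inside the intended environment is to ask Python where it lives. Here is a small sketch that can be pasted into a notebook cell; the function name `describe_interpreter` is made up, and the two Conda variables are only set when a Conda environment has been activated.

```python
import os
import sys


def describe_interpreter() -> dict:
    """Collect a few facts about the Python interpreter running this code."""
    return {
        "executable": sys.executable,  # path to the running Python binary
        "prefix": sys.prefix,          # root folder of the active environment
        # Set by `conda activate`; None if no Conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
    }


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

If the printed paths point to the environment's folder rather than the system-wide installation, the notebook is using the right interpreter.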
It is possible to leave the environment again by running:
conda deactivate
That is everything needed to get going.
Share settings
One of the good things about using an environment is that it is possible to save information about installed packages and their versions:
pip freeze --local > requirements.txt
This is particularly useful when moving between computers or sharing with others. Then one can install all packages based on the requirements list:
pip install -r requirements.txt
This should give the new environment the same packages and versions as the original one.
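After installing from a requirements list, one can check whether the environment actually matches it. The sketch below is a simplified illustration, not a replacement for pip's own checks: it only understands lines of the form `name==version`, the helper names are my own, and the example versions are made up.

```python
from importlib import metadata


def parse_pins(text: str) -> dict:
    """Return {name: version} for lines of the simple form name==version."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and anything that is not an exact pin
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def check_pins(pins: dict) -> dict:
    """Map each package to 'ok', 'missing', or its differing installed version."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed == wanted else f"installed {installed}"
    return report


# Illustrative content only; a real file would come from `pip freeze`.
example = """
pandas==2.0.0
# a comment line
numpy==1.24.2
"""
pins = parse_pins(example)
print(pins)
print(check_pins(pins))
```

To check a real file, replace `example` with the text read from requirements.txt. Any entry that is not reported as ok points to a package that differs between the two computers.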
Conclusion
For my annual #StillStanding project, I create a new notebook each day. I copy the content from one day to the next and develop the code slowly. Many things have changed in my notebooks since I started in January, and some of the notebooks from earlier this year no longer work. I am experimenting with different packages and versions, and along the way, I am also discovering and fixing bugs in the Musical Gestures Toolbox for Python. This is where environments come in handy: they let me keep track of things when I update packages.
I was late to the party, but I have finally discovered the joys of environments. Most importantly, I have found a workflow that is both easy and manageable.