conda

conda is a cross-platform tool for "package, dependency and environment management". It makes old code reproducible even when some packages and dependencies have been updated in the meantime. conda can manage packages for any language and can even act as a manager for bioinformatics software. We will also use it to manage our Python environments, since it makes it easy to invoke different Python versions as needed for different tasks. Instead of installing software across the entire machine, you install it only in a contained environment. Below are instructions for installing and setting up conda, and a representative example of how virtual environments work.

Installation

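There are two flavors of conda: Anaconda and Miniconda. We recommend the more lightweight Miniconda, which can be installed on macOS using the commands below. The installer shown is for Intel (x86_64) Macs; Apple Silicon machines need the MacOSX-arm64 installer instead. wget is not included with macOS by default, so install it first or download the script with curl.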

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh -O ~/miniconda.sh

bash ~/miniconda.sh -p $HOME/miniconda3

These commands download the Miniconda installer and run it; follow the prompts to install Miniconda into $HOME/miniconda3. If you allow the installer to initialize conda, it also updates your shell startup file so that conda's executables are found on your lookup $PATH, which allows you to run them without referring explicitly to their location.

Usage and an Example

Now that conda is installed, let's look at how virtual environments work. First, let's create a new environment that uses Python 3.9:

conda create --name test_environment python=3.9

The environment has now been created. You can list all the conda environments that currently exist on your machine with conda env list.

Here is a list of current environments on my machine (* marks current environment):

# conda environments:
#
base                  *  /Users/$username/miniconda3
dev                      /Users/$username/miniconda3/envs/dev
test_environment         /Users/$username/miniconda3/envs/test_environment
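If you ever need to work with this listing from a script, the text can be split into names and paths. The following is only a minimal sketch: parse_env_list is a hypothetical helper (not part of conda), and it assumes paths contain no spaces and that the layout matches the sample above.

```python
def parse_env_list(text):
    """Parse `conda env list` text output into (name, path, is_active) tuples."""
    envs = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the commented header.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        # The asterisk marks the currently active environment.
        is_active = "*" in parts
        fields = [p for p in parts if p != "*"]
        path = fields[-1]
        # Assumption: a line holding only a path is an environment without a name.
        name = fields[0] if len(fields) > 1 else None
        envs.append((name, path, is_active))
    return envs


# Sample text copied from the listing above.
sample = """\
# conda environments:
#
base                  *  /Users/$username/miniconda3
dev                      /Users/$username/miniconda3/envs/dev
test_environment         /Users/$username/miniconda3/envs/test_environment
"""

for name, path, is_active in parse_env_list(sample):
    marker = "(active)" if is_active else ""
    print(name, path, marker)
```

For anything beyond a quick check, conda env list --json produces machine-readable output that is likely a sturdier starting point than parsing the table.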

To activate the test_environment that you just created (older conda releases used source activate instead):

conda activate test_environment

Now you are in the virtual environment, using Python 3.9 as a default. You can check to make sure this is the case with python --version.

You can also see the path to your environment's Python with which python.
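The same checks can be made from inside Python, which is handy at the top of an analysis script. This is a small sketch rather than a required step: describe_interpreter is a hypothetical helper name, and it assumes conda has set the CONDA_DEFAULT_ENV variable during activation (the value is None when no environment is active).

```python
import os
import sys


def describe_interpreter():
    """Collect basic facts about the Python interpreter that is running."""
    return {
        # Equivalent to `python --version`.
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # Equivalent to `which python` for the running interpreter.
        "executable": sys.executable,
        # Root folder of the environment the interpreter belongs to.
        "prefix": sys.prefix,
        # Name of the active conda environment, if conda set it.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

Inside test_environment, the executable path should point into the envs/test_environment folder rather than to the base installation.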

Now that you are in the correct environment, you can begin adjusting it to suit your purposes. The main way to do this is by installing the Python packages that you will use in your analysis. There are multiple ways to install a Python package. The primary means of installation is conda install {package name}. If you receive a PackagesNotFoundError, search to identify other means of conda installation. This may include using different channels such as bioconda or conda-forge (e.g., conda install --channel bioconda {package name}).

If a package is not available from conda, you may use pip install {package name}. While conda attempts to be compatible with pip, you may occasionally run into issues with packages installed through this route.

Package installation example: As an example, we will use conda to install NumPy. NumPy is a popular package for scientific computing, particularly computing that involves arrays and matrices. First make sure you are in the correct environment, then run:

conda install numpy

You should receive a message reporting the successful installation of the package. You can verify the installation by starting Python with python and then loading the package with >>> import numpy as np. If the package loads successfully, you will get no message. If the import fails, you will see:

 ModuleNotFoundError: No module named 'numpy'
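When a script depends on several packages, it can be convenient to check for them without triggering the error above. The sketch below uses only the standard library; package_status is a hypothetical helper, and it assumes the import name matches the distribution name (true for numpy, not for every package).

```python
import importlib.util
from importlib import metadata


def package_status(name):
    """Return the installed version of `name`, or None if it cannot be imported."""
    # find_spec returns None when a top-level module is not importable.
    if importlib.util.find_spec(name) is None:
        return None
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata under this name
        # (for example, a standard-library module).
        return "unknown"


for package in ["numpy"]:
    status = package_status(package)
    if status is None:
        print(f"{package}: not installed in this environment")
    else:
        print(f"{package}: version {status}")
```

A result of None for numpy usually means the wrong environment is active or the install step was skipped.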

Zamanian Lab Environments

Whenever we begin a new analysis that will include Python scripts, it is the developer's responsibility to build a conda environment that is suitable for those scripts. Instructions for creating this environment should be included in the README file on the main page of the associated GitHub repo.

Cleaning

conda environments can take up a lot of disk space, especially when different environments use different versions of the same packages. It's good practice to delete environments that aren't being actively used or updated (but keep notes on the packages and versions you would need to reproduce each environment). conda also has its own cleanup command that should be run regularly:

conda clean -ay
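To decide which environments are worth deleting, it can help to see roughly how large each one is. The following is a rough sketch, not a conda feature: environment_sizes is a hypothetical helper, and the ~/miniconda3/envs path assumes the install location used earlier. Because conda may hard-link files shared between environments, these totals can overstate the space actually freed by removing one.

```python
import os
from pathlib import Path


def directory_size_bytes(root):
    """Sum the sizes of all files below `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            file_path = os.path.join(dirpath, filename)
            # lstat reports the size of a symlink itself rather than its target.
            total += os.lstat(file_path).st_size
    return total


def environment_sizes(envs_dir):
    """Map each environment folder name under `envs_dir` to its size in bytes."""
    envs_dir = Path(envs_dir).expanduser()
    if not envs_dir.is_dir():
        return {}
    return {
        child.name: directory_size_bytes(child)
        for child in sorted(envs_dir.iterdir())
        if child.is_dir()
    }


# Assumed location; adjust if Miniconda was installed elsewhere.
sizes = environment_sizes("~/miniconda3/envs")
for name, size in sorted(sizes.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {size / 1e6:.1f} MB")
```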

Useful Resources

conda cheatsheet