Conda vs Pip: Choosing your Python package manager


Conda and pip are sometimes used interchangeably in day-to-day Python work. Both are very popular package managers for Python. Although some of their functionality overlaps, they were designed for, and should be used for, different purposes. The table below summarizes the main differences between conda and pip.

Conda vs Pip – Quick Comparison

| Point of Difference | pip | Conda |
| --- | --- | --- |
| Multi-Language Dependency | Not supported | Supported |
| Package Installation | Built on wheels | Downloads binaries |
| Package Availability | 235,000 packages | 1,500+ packages |
| Dependency Management | No SAT test | Performs SAT test |
| Virtual Environment Management | No built-in virtual environment management | Built-in virtual environment management system |
| Minimalistic | Yes | No |
Table 1: Key summary of differences between pip and conda

Before we look more closely at the differences between these two package managers, let us cover some basics about pip and conda.

What is pip?

Pip is a simple command-line tool for installing Python packages. It is the standard, recommended way to install packages published on the Python Package Index (PyPI). Pip is already installed if you are using Python 3.4 (or higher) downloaded from python.org, or if you are working in a virtual environment created by virtualenv or venv.
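
As a minimal sketch of the point above, the snippet below checks whether pip can be imported by the running interpreter and builds an install command. The helper names `pip_is_available` and `pip_install_command` are made up for illustration, and `requests` is just an example package. Invoking pip as `python -m pip` through `sys.executable` is a common way to make sure a package lands in the interpreter you are actually running.

```python
import importlib.util
import sys


def pip_is_available():
    """Return True if the pip module can be imported by this interpreter."""
    return importlib.util.find_spec("pip") is not None


def pip_install_command(package):
    """Build (but do not run) the command that installs `package` with pip."""
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    print("pip available:", pip_is_available())
    print("command:", " ".join(pip_install_command("requests")))
```

The command is only built, not executed, so the sketch is safe to run anywhere; passing the list to `subprocess.run` would perform the actual installation.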

What is Conda?

Conda is a package, dependency, and environment management system. It was originally developed for Python but was later extended to work with other languages such as R, Java, Scala, FORTRAN, and C/C++. It provides an easy way to install, update, and remove packages and to handle their dependencies. In its default configuration, conda installs packages from the official conda repository instead of the standard language-specific repositories.

Now that we have a basic idea of the two package management systems, let us look at the important differences that make them what they are:

1. Handling of non-Python Dependencies

As we saw earlier, conda supports languages other than Python. This might seem trivial, but it is a very powerful and much-needed feature when it comes to dependency management.

Python packages often depend on programs or libraries written in languages other than Python. Pip cannot properly handle non-Python dependencies such as LLVM or HDF5, which can leave certain packages broken.

So conda is in fact a step ahead of pip in handling dependencies.

2. Package installation

There is a very important difference in how these two tools install packages.

Python packages on PyPI are published as wheels or source distributions. A wheel is a pre-built package that pip can install directly. When no compatible wheel exists for your platform, pip falls back to the source distribution, and any compiled extensions must then be built on your local machine. That build requires compatible compilers and libraries to be installed before you invoke the pip command.

Conda, on the other hand, uses pre-compiled binaries downloaded from the Anaconda repository and Anaconda Cloud. This approach keeps the installation process free of compiler and library dependency problems.
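
To make the wheel versus source distribution distinction more concrete, here is a rough sketch that classifies a file name the way it might appear on PyPI. It assumes the standard wheel naming scheme, in which the name ends with a Python tag, an ABI tag, and a platform tag. The function `describe_distribution` and the file names used with it are illustrative only.

```python
def describe_distribution(filename):
    """Classify a PyPI file name as a wheel or a source distribution."""
    if filename.endswith(".whl"):
        # Wheel names end with <python tag>-<abi tag>-<platform tag>.whl
        stem = filename[: -len(".whl")]
        python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
        # A platform tag of "any" means the wheel contains no compiled code.
        kind = "pure-Python wheel" if platform_tag == "any" else "platform wheel"
        return f"{kind} (python={python_tag}, abi={abi_tag}, platform={platform_tag})"
    if filename.endswith((".tar.gz", ".zip")):
        return "source distribution (may need a local build)"
    return "unknown"


if __name__ == "__main__":
    for name in (
        "requests-2.31.0-py3-none-any.whl",
        "numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
        "numpy-1.26.4.tar.gz",
    ):
        print(name, "->", describe_distribution(name))
```

A platform wheel already contains compiled code for one specific platform, so no compiler is needed to install it. Only the source distribution case can trigger a local build.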

3. Package Availability

Both approaches to packaging and installing packages are valid, and each comes with its own advantages and disadvantages.

Conda makes installation easier and optimizes the user experience, whereas pip makes package maintenance easier for developers, who would otherwise be forced to compile their packages for every platform.

Package compilation is expensive in terms of both time and space. A large number of packages (more than 150,000) are published and maintained on PyPI. Some of these are personal projects or packages with a niche user base.

Conda, sadly, does not (and cannot) support all the packages present on PyPI. The Anaconda repository and Anaconda Cloud contain more than 1,500 packages, focused mainly on scientific computing and machine learning.

The difference in package availability is stark, and pip is by far the better package manager in this respect.

Note: To install packages that are not available through conda, you can use pip inside any conda environment. Pip and conda can be used together, but mixing them is usually not recommended.
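
If you do mix the two, it helps to know which kind of environment you are in. The sketch below relies on the fact that conda keeps its package records in a `conda-meta` directory inside each environment; treating the presence of that directory as a signal is a heuristic rather than an official API. The helper names and the wording of the hints are assumptions made for illustration.

```python
import os
import sys


def is_conda_environment(prefix=sys.prefix):
    """Return True if `prefix` looks like a conda environment.

    The check looks for the `conda-meta` directory that conda creates
    inside each environment. This is a reasonable hint, not a guarantee.
    """
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


def install_hint(package, prefix=sys.prefix):
    """Suggest which tool to try first for `package` in this environment."""
    if is_conda_environment(prefix):
        return f"try `conda install {package}` first, then fall back to pip"
    return f"use `python -m pip install {package}`"


if __name__ == "__main__":
    print(install_hint("requests"))
```

Preferring conda packages where they exist and reaching for pip only for what is missing keeps the amount of mixing as small as possible.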

4. Dependency management

The most important difference between pip and conda is how they solve the dependency problem.

Pip has traditionally used a recursive, serial loop for installing dependencies, without checking that the dependencies of all packages are satisfied simultaneously. Versions from pip 20.3 onward ship a backtracking resolver that catches many such conflicts, so the behavior described here applies mainly to older versions.

If packages installed earlier have dependencies whose versions are incompatible with those of packages installed later, the environment ends up broken. Worse, the problem goes undetected until you run into strange errors.

Conda solves this problem using a satisfiability (SAT) solver to verify that all requirements of all packages installed in an environment are met. This check can take extra time but helps prevent the creation of broken environments. As long as package metadata about dependencies is correct, conda will predictably produce working environments.
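
The difference between the two strategies can be sketched with a toy model. This is not pip's or conda's actual algorithm: the package index below is hypothetical, the serial installer is a deliberately naive greedy loop, and a brute-force search over all version combinations stands in for a real SAT solver. All names (`INDEX`, `serial_install`, `is_consistent`, `solve`) are made up for illustration.

```python
from itertools import product

# Hypothetical toy index: package -> {version: {dependency: allowed versions}}.
INDEX = {
    "app_a": {1: {"lib": {1, 2}}},  # app_a works with lib 1 or 2
    "app_b": {1: {"lib": {1}}},     # app_b only works with lib 1
    "lib": {1: {}, 2: {}},
}


def serial_install(requested):
    """Install packages one at a time, never revisiting earlier choices."""
    env = {}

    def install(name, allowed=None):
        if name in env:
            return  # already present: kept as-is, even if it does not fit
        candidates = [v for v in INDEX[name] if allowed is None or v in allowed]
        env[name] = max(candidates)  # greedily take the newest allowed version
        for dep, dep_allowed in INDEX[name][env[name]].items():
            install(dep, dep_allowed)

    for name in requested:
        install(name)
    return env


def is_consistent(env):
    """Check that every installed package has all its requirements met."""
    return all(
        env.get(dep) in allowed
        for name, version in env.items()
        for dep, allowed in INDEX[name][version].items()
    )


def solve():
    """Try every version combination (newest first) for the whole toy index
    and return the first environment in which all requirements hold."""
    names = sorted(INDEX)
    choices = [sorted(INDEX[name], reverse=True) for name in names]
    for versions in product(*choices):
        env = dict(zip(names, versions))
        if is_consistent(env):
            return env
    return None  # no combination satisfies every requirement


if __name__ == "__main__":
    broken = serial_install(["app_a", "app_b"])
    print("serial:", broken, "consistent:", is_consistent(broken))
    solved = solve()
    print("solved:", solved, "consistent:", is_consistent(solved))
```

In this toy case the serial installer picks version 2 of `lib` for `app_a` and never revisits that choice, which silently breaks `app_b`. The exhaustive search considers all requirements together and settles on version 1 of `lib`, which satisfies both packages.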

So conda is generally a better choice when it comes to dependency management.

5. Virtual Environment Management

As mentioned earlier, pip is just a small tool for managing packages. Conda offers much more than that: it comes with a built-in virtual environment manager.

With pip, you need a separate program such as pipenv or virtualenv to create virtual environments. This is a deliberate design decision that keeps pip focused on package management and prevents it from becoming bloated. Used together, pip and one of these environment managers can create and manage virtual environments effectively.

Conda offers an out-of-the-box virtual environment manager. Not only does it provide the same functionality as virtualenv and pipenv, it also lets us choose the Python version of each virtual environment. This makes it easier to work with outdated packages or packages that are only available for older versions of Python.
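
As a rough illustration of the two workflows, the sketch below creates an environment with Python's standard-library `venv` module (used here as a stand-in for virtualenv) and builds, without running, a `conda create` command that pins the Python version. The helper names, the environment name `legacy`, and the version `3.8` are assumptions chosen for the example.

```python
import os
import tempfile
import venv


def create_venv(path):
    """Create a virtual environment with the standard-library venv module."""
    # with_pip=False skips bootstrapping pip, which keeps the sketch fast;
    # pass with_pip=True to get a pip inside the new environment.
    venv.create(path, with_pip=False)
    # Every venv-created environment has a pyvenv.cfg file at its root.
    return os.path.isfile(os.path.join(path, "pyvenv.cfg"))


def conda_create_command(name, python_version):
    """Build (but do not run) a conda command that pins the Python version."""
    return ["conda", "create", "--name", name, f"python={python_version}"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print("venv created:", create_venv(os.path.join(tmp, "env")))
    print(" ".join(conda_create_command("legacy", "3.8")))
```

An environment created by `venv` reuses the interpreter that created it, whereas the conda command asks for a specific Python version as part of the environment itself. The conda command is only built here because conda may not be installed on the machine running the sketch.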

6. Minimalism

Pip is a simple command-line tool that aims to do only one thing. It is simple, modular, and minimalist by design.

Conda, on the other hand, was designed to provide an easy, all-in-one solution. It was meant to be an alternative approach to pip, and it is not at all minimal. When installed through the Anaconda distribution, conda comes with a bunch of pre-installed packages and software.

This non-minimalist approach may be undesirable for some users. Conda addresses this by offering a smaller alternative: Miniconda. Miniconda offers all the features of conda but installs only the minimal set of packages required to set up conda.

Conclusion – Conda vs Pip

This brings us to the end of this article on pip and conda. Stay tuned for more such articles on Python.