Installing Python in a Dockerfile Without the Python Base Image

Docker is a popular containerization technology that allows applications to run in isolated environments called containers. Containers share the host machine’s OS kernel but run as isolated processes, allowing portability across environments.

A key benefit of Docker is the ability to define reusable images using a Dockerfile. The Dockerfile lists the commands to assemble the image. When we build an image, Docker will execute these commands and cache intermediate results. This makes building derivative images very fast.
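To make the caching behaviour concrete, here is a minimal sketch of a cache-friendly instruction order. It assumes a project with a requirements.txt file next to the application code; the /app path is a placeholder and not something this article prescribes.

```dockerfile
FROM ubuntu:20.04

RUN apt-get update \
  && DEBIAN_FRONTEND=noninteractive apt-get install -y python3 python3-pip \
  && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Dependencies change rarely: copy only the manifest so this layer stays cached
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

# Application code changes often: copy it last so edits only invalidate this layer
COPY . .
```

With this order, editing application code reuses the cached dependency layer, and only the final COPY is re-run.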

A Dockerfile normally begins with a FROM instruction that names a base image such as Ubuntu or Alpine Linux. We can install additional software on top of this base image to build our own stack.

A common practice is to start with language-specific base images like python or node which have the language runtime and package managers pre-installed. However, sometimes our use case may require starting with a different base image.

In this article, we will discuss strategies for installing Python in a Dockerfile when not using the Python base image.

Why Install Python in a Non-Python Base Image?

Here are some reasons you may need to install Python in a non-Python base image:

  • Integrate Python into an application stack based on a different language (e.g. Ruby, Java)
  • Include Linux tools needed for your workflow (e.g. awk, sed) more easily by using a system base image over Python
  • Leverage Docker multi-stage builds to produce a production image without the Python dev dependencies
  • Customize and optimize the Python installation instead of relying on the official Python image

2 Methods for Installing Python

When installing Python in a non-Python base image, we need to ensure we install both the Python interpreter and the Python package manager (pip). This allows us to not only run Python code but also install additional libraries.

Here are some effective strategies for installing Python:

1. Using the System Package Manager

Most base images include a system package manager like apt (Debian/Ubuntu) or apk (Alpine Linux) to install system packages. We can use these to install Python:

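FROM ubuntu:20.04
# DEBIAN_FRONTEND=noninteractive keeps apt from prompting (for example, for a time zone) during the build
RUN apt-get update \
  && DEBIAN_FRONTEND=noninteractive apt-get install -y python3 python3-pip \
  && rm -rf /var/lib/apt/lists/*

This installs the distribution's packaged Python 3 interpreter, along with pip for installing Python libraries.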
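The same approach on Alpine uses apk. The following is a minimal sketch, assuming the Alpine release in use packages pip separately as py3-pip (as recent releases do):

```dockerfile
FROM alpine:3.12

# --no-cache fetches the package index on the fly and leaves no cache to clean up
RUN apk add --no-cache python3 py3-pip
```

Note that Alpine uses musl libc rather than glibc, so some Python packages with compiled extensions may need to be built from source when installed with pip.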

Pros:

  • Leverages system package management tooling
  • Simple workflow similar to installing locally

Cons:

  • Can be slow, because packages such as python3-pip may pull in a full build toolchain as dependencies
  • May install older Python versions

2. Install From Source

We can manually install Python by downloading the source, compiling it, and installing it:

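FROM alpine:3.12

ENV PYTHON_VERSION=3.11.0

# Runtime libraries the interpreter links against; these stay in the image
RUN apk add --no-cache libbz2 libffi openssl xz-libs zlib

# build-base supplies gcc and make; the -dev packages supply headers
RUN apk add --no-cache --virtual .build-deps \
        build-base \
        bzip2-dev \
        libffi-dev \
        openssl-dev \
        xz-dev \
        zlib-dev \
    && wget https://www.python.org/ftp/python/$PYTHON_VERSION/Python-$PYTHON_VERSION.tgz \
    && tar xzf Python-$PYTHON_VERSION.tgz \
    && cd Python-$PYTHON_VERSION \
    && ./configure \
    && make -j $(nproc) \
    && make install \
    && cd .. \
    && rm -rf Python-$PYTHON_VERSION.tgz Python-$PYTHON_VERSION \
    && apk del .build-deps

This installs the runtime libraries, downloads and compiles the specified Python version, installs it under /usr/local, and then removes the source tree and the build dependencies. The compiler toolchain and headers live in the .build-deps group so that a single apk del removes them, while the runtime libraries are installed separately so that they survive the cleanup.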
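After a source build it can be worth checking that the optional modules were actually compiled, since a missing header usually causes a module to be skipped rather than failing the build. A small sanity-check step, offered as a sketch (the module list is only an illustrative selection), could follow the build:

```dockerfile
# Fails the image build early if the interpreter, pip, or a commonly needed module is missing
RUN python3 --version \
    && python3 -c "import bz2, ctypes, lzma, ssl, zlib" \
    && python3 -m pip --version
```

If the import line fails, the usual cause is that the matching -dev package was absent at compile time or that its runtime library was removed afterwards.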

Pros:

  • Can install specific Python versions
  • Removes build dependencies for smaller image

Cons:

  • Complex build process
  • Compiling from source is slower

Installing Pip and Virtualenvs

Once we have Python installed, we likely want to install pip and virtualenv support:

RUN python3 -m ensurepip --upgrade \
  && pip3 install --no-cache-dir --upgrade pip setuptools wheel virtualenv

This uses ensurepip to bootstrap pip and then upgrades it, along with setuptools, wheel, and virtualenv. Debian and Ubuntu disable ensurepip for their system Python, so on those images pip comes from the python3-pip package instead.

We can create a virtualenv for our application dependencies instead of installing them globally:

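RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

This creates a virtual environment at /opt/venv and puts its bin directory first on PATH, so python and pip resolve to the virtual environment in later instructions and at runtime. On Debian and Ubuntu the venv module is packaged separately as python3-venv, which must be installed first.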
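Put together, a virtual environment workflow on an Ubuntu base might look like the following sketch. The requirements.txt file name is an assumption about the project layout.

```dockerfile
FROM ubuntu:20.04

# python3-venv provides the venv module on Debian and Ubuntu
RUN apt-get update \
  && DEBIAN_FRONTEND=noninteractive apt-get install -y python3 python3-venv \
  && rm -rf /var/lib/apt/lists/*

RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# pip now resolves to /opt/venv/bin/pip, so packages land inside the virtual environment
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
```

Because the environment lives in a single directory, it can also be copied as a unit into a later build stage that has the same Python interpreter at the same path.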

Python Development vs Production Images

It’s common to use separate Dockerfiles for development and production images.

The Python development image would include system dependencies, Python source, and dev packages like ipython, pytest, tox, etc.
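As a rough illustration, a development Dockerfile on an Ubuntu base could look like this. The tool list and the pytest default command are placeholders chosen for the sketch, not requirements.

```dockerfile
FROM ubuntu:20.04

RUN apt-get update \
  && DEBIAN_FRONTEND=noninteractive apt-get install -y python3 python3-pip \
  && rm -rf /var/lib/apt/lists/*

# Development-only tools; leave these out of the production image
RUN pip3 install --no-cache-dir ipython pytest tox

WORKDIR /app
COPY . .

# Run the test suite by default; override the command for an interactive shell
CMD ["python3", "-m", "pytest"]
```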

The production image only needs the Python runtime and application dependencies. We can build this using a multi-stage Dockerfile:

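FROM python:3.9-slim-buster AS builder

WORKDIR /app

COPY requirements.txt .
# Build wheels for the requirements and their dependencies
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt

# The final stage needs a shell and pip for the RUN step below
FROM python:3.9-slim-buster

WORKDIR /app

COPY --from=builder /wheels /wheels
COPY requirements.txt app.py ./

# Install only from the prebuilt wheels, without contacting the package index
RUN pip install --no-cache-dir --no-index --find-links=/wheels -r requirements.txt \
  && rm -rf /wheels

ENTRYPOINT ["python", "app.py"]

This uses a builder stage to compile the dependencies into wheels and then installs those wheels in a clean final stage, so any compilers or headers added to the builder never reach the production image. Both stages use the same Python version so that the wheels stay compatible. A distroless image such as gcr.io/distroless/python3 is smaller still, but it ships without a shell or pip, so a RUN pip install step cannot execute in it; with such an image the dependencies have to be installed in the builder and copied across instead.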

Final Thoughts

Installing Python in non-Python base Docker images unlocks additional use cases. By leveraging system package managers, compiling from source, and separating development and production builds, we can create optimized containers tailored to our stack and workflow.

The key is understanding the base image and what Python installation options it supports. Alpine uses apk and musl libc, while Debian and Ubuntu use apt and glibc, so package names and build dependencies differ between them.

With the above techniques, we can integrate Python into any Dockerfile while having control over the specific versions and optimization for development or production environments.