Build a multi-stage Docker image from official Python images with support for Poetry projects.
Updated following issues on the GitHub repository.
- Python images on docker.com
- Dockerfile on github.com/michaeloliverx
- Dockerfiles on mktr.ai
- Discussions on Github Poetry
- Multi-stage build
- Choosing a base version
- Stage: Staging
- Stage: Development
- Stage: Build
- Stage: Production
- Build our image and use it!
A multi-stage build allows:
- to stop at a specific step of a build
- start from different base thus beginning a new stage of the bulid
- pass artifacts from one build to another
I started this while looking for the best way to use both Docker and Poetry, and stumble upon an quite complete dockerfile at github.com/michaeloliverx/python-poetry-docker-example. The author use a multi-stage build to offer several images for development, testing, linting and production. Stopping at a specific images allows to minimize the image size and time to run for each step. However I was not completly happy with this example which was a bit too complex for me and I wanted to dig in anyway.
In our case we will have the following stages:
staging: Installs poetry and copy relevant source files
development: Install our project in editable mode
build: Build our project into a wheel file
production: Clean Python image that installs our built wheel
Choosing a base version
To start of our image we need to chose the base image, with two obvious options:
- Official Linux images (Ubuntu, Debian, RHEL)
- Official Python images
I discarded the first one as you don’t always have the most recent Python versions, and I am not so worried about performance differences pointed out by Python=>Speed.
Regarding official Python images on docker.com, we still have to chose the tag (i.e. flavor) we want:
(- ): based of a specific or not Debian version with common packages installed
-slim: based of a specific or not Debian version with only the strict requirements for working Python environment
-alpine: discarded because although much smaller, it brings complexity specific to the Alpine distribution
I chose 1 by default, the recommended version and including building tools that are needed anyway.
FROM python:3.10.0 as python-base
ARG and environment variables
We set up a few environment variables for Python, Pip and Poetry configurations.
A few things to keep in mind:
- Dockerfile doesn’t support reference to previously defined environment variables in the same
ARGdefined before a stage can be used by referencing them again in the stage.
- When using a different base image than the previous stages,
ENVvariables won’t be defined anymore. (seems obvious once said…)
ARG APP_NAME=coach_planner ARG APP_PATH=/opt/$APP_NAME ARG PYTHON_VERSION=3.10.0 ARG POETRY_VERSION=1.1.11
- Python process will be ran only once in the container so we don’t need to write the compiled Python files (*.pyc) to disk
- Make sure Python outputs are sent straight to terminal
- Make sure Python traceback are dumped (even on segfaults for instance)
- No need for the cache in the Docker image
ENV \ PYTHONDONTWRITEBYTECODE=1 \ PYTHONUNBUFFERED=1 \ PYTHONFAULTHANDLER=1 ENV \ POETRY_VERSION=$POETRY_VERSION \ POETRY_HOME="/opt/poetry" \ POETRY_VIRTUALENVS_IN_PROJECT=true \ POETRY_NO_INTERACTION=1
- We won’t update the pip version in any case
- Default timeout is only 15 seconds
- Fixing Poetry version
- Instead of /root
- Make sure the
.venvdirectory will be in the build directory
- No prompt from Poetry
- Path for building stages
- Add the virtual environment to path in a separate
ENVline to use previously defined environment variables.
- Update path with Poetry and virtual env path.
Installation follow Poetry’s official documentation and make use of the new install script supporting the upcoming Poetry version. We need to update our
PATH to be able to use poetry afterwards.
# Install Poetry - respects $POETRY_VERSION & $POETRY_HOME RUN curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/install-poetry.py | python ENV PATH="$POETRY_HOME/bin:$PATH"
Source file and dependencies
To copy our source files, we make the assumption that the directory structure follows the one poetry would create:
poetry-demo ├── pyproject.toml ├── poetry_demo │ └── __init__.py
We also obviously import the
WORKDIR $APP_PATH COPY ./poetry.lock ./pyproject.toml ./ COPY ./$APP_NAME ./$APP_NAME
Make sure to specify the
ARG we need in this stage.
FROM staging as development ARG APP_NAME ARG APP_PATH
Install our project
Nothing fancy here, we make use of poetry command.
WORKDIR $APP_PATH RUN poetry install
Flask webserver and entrypoint
In development mode we use the default flask webserver. We first define a few flask related environment variable, and then define an entrypooint as the
flask run command. This run the following command in the activated virtual environment of the project. This has several advantages:
- no direct
- express the intented use of this stage
- document how to use this stage
- easy command override while not bothering with the virtual environment. e.g.
docker run -it poetry flask shell.
ENV FLASK_APP=$APP_NAME \ FLASK_ENV=development \ FLASK_RUN_HOST=0.0.0.0 \ FLASK_RUN_PORT=8888 ENTRYPOINT ["poetry", "run"] CMD ["flask", "run"]
We can still access a shell by overriding the entrypoint, but that shouldn’t be the most common use case imho. Note that in this cas one should activate the environement.
We first use the
poetry build command, and add the
--flag wheel parameter to only build the wheel.
Then we use
poetry export to get a file containing dependency version contrainsts for our future pip installation. We pass the
--without-hashes but this could be removed and take part of
pip install --require-hashes.
FROM staging as build ARG APP_PATH WORKDIR $APP_PATH RUN poetry build --format wheel RUN poetry export --format requirements.txt --output constraints.txt --without-hashes
For our production, we will start from a clean python image, and install our freshly built application.
We redefined some Python related environemnt variable (required since we start from a fresh image), and we add some directly related to PIP. Note that again we reference our
ARG to be able to use them in this stage.
FROM python:$PYTHON_VERSION as production ARG APP_NAME ARG APP_PATH ENV \ PYTHONDONTWRITEBYTECODE=1 \ PYTHONUNBUFFERED=1 \ PYTHONFAULTHANDLER=1 ENV \ PIP_NO_CACHE_DIR=off \ PIP_DISABLE_PIP_VERSION_CHECK=on \ PIP_DEFAULT_TIMEOUT=100
Installating our application
We first retrieve the packaged application and constraints file from the
build stage using the
--from flag of the
CP command. Then we proceed with the installation using PIP.
Note that we make use of wildcards, but we could also add the application version similarly to Python and Poetry versions, to get a more determinist install.
WORKDIR $APP_PATH COPY --from=build $APP_PATH/dist/*.whl ./ COPY --from=build $APP_PATH/constraints.txt ./ RUN pip install ./$APP_NAME*.whl --constraint constraints.txt
We will use
gunicorn to serve our application in production.
We define two environment variables that will be used in the
gunicorn command allowing to be overriden (mostly for the
PORT). This can be used when deplyoing on GCP Cloud Run for instance.
# gunicorn port. Naming is consistent with GCP Cloud Run ENV PORT=8888 # export APP_NAME as environment variable for the CMD ENV APP_NAME=$APP_NAME
The difference between
CMD can be confusing especially when used together, and I would recommend reading Docker documentation on the topic. In our case we need shell variable substitution for the environment variable so it limits our choice. Other alternative would be to use a
COPY ./docker/docker-entrypoint.sh /docker-entrypoint.sh # 1 RUN chmod +x /docker-entrypoint.sh # 1 ENTRYPOINT ["/docker-entrypoint.sh"] # 2 CMD ["gunicorn", "--bind :$PORT", "--workers 1", "--threads 1", "--timeout 0", "\"$APP_NAME:create_app()\""] # 3
- Get the entrypoint script (see below) and make it executable
- With this syntax, this is equivalent to
- These arguments will be passed to the entrypoint scripts and can be overriden by the arguments passed to
And the script:
#!/bin/sh set -e # 1 eval "exec $@" # 2
- Will exit the script if any error occurs
- Expand arguments passed to the entrypoint script in the shell before passing them to
exec, so that it can support passing environment variable.
Build our image and use it!
Let’s build our image and try using it.
$ docker build --tag poetry --file docker/Dockerfile . [+] Building 33.3s (17/17) FINISHED => [internal] load build definition from Dockerfile 0.0s => => transferring dockerfile: 2.13kB 0.0s => [internal] load .dockerignore 0.0s => => transferring context: 2B 0.0s => [internal] load metadata for docker.io/library/python:3.10.0 1.2s => [internal] load build context 0.0s => => transferring context: 332B 0.0s => CACHED [staging 1/5] FROM docker.io/library/python:3.10.0@sha256:bb797f045026352 0.0s => => resolve docker.io/library/python:3.10.0@sha256:bb797f045026352ece65ab376c5666 0.0s => CACHED [production 2/6] WORKDIR /opt/coach_planner 0.0s => [staging 2/5] RUN curl -sSL https://raw.githubusercontent.com/python-poetry/poe 22.1s => [staging 3/5] WORKDIR /opt/coach_planner 0.1s => [staging 4/5] COPY ./poetry.lock ./pyproject.toml ./ 0.1s => [staging 5/5] COPY ./coach_planner ./coach_planner 0.1s => [build 1/2] WORKDIR /opt/coach_planner 0.1s => [build 2/2] RUN poetry build --format wheel 2.2s => [production 3/6] COPY --from=build /opt/coach_planner/dist/*.whl ./ 0.1s => [production 4/6] RUN pip install ./coach_planner*.whl 3.5s => [production 5/6] COPY ./docker/docker-entrypoint.sh /docker-entrypoint.sh 0.1s => [production 6/6] RUN chmod +x /docker-entrypoint.sh 0.6s => exporting to image 0.2s => => exporting layers 0.2s => => writing image sha256:a612b9cdb91bacb1cefd1393318a5d15238034183fd48dbbeafffdd7 0.0s => => naming to docker.io/library/poetry 0.0s
Note that there is only the
production stages here. The
development stage is not required for the final (i.e. default) stage, it is not even built.
$ docker run -p 8888:8888 -it poetry [2021-11-15 18:17:59 +0000]  [INFO] Starting gunicorn 20.1.0 [2021-11-15 18:17:59 +0000]  [INFO] Listening at: http://0.0.0.0:8888 (1) [2021-11-15 18:17:59 +0000]  [INFO] Using worker: sync [2021-11-15 18:17:59 +0000]  [INFO] Booting worker with pid: 11
Let’s change our gunicorn binding port:
$ docker run -p 8888:8888 --env PORT=5555 -it poetry [2021-11-17 18:21:28 +0000]  [INFO] Starting gunicorn 20.1.0 [2021-11-17 18:21:28 +0000]  [INFO] Listening at: http://0.0.0.0:5555 (1) [2021-11-17 18:21:28 +0000]  [INFO] Using worker: sync [2021-11-17 18:21:28 +0000]  [INFO] Booting worker with pid: 7
Now let’s use our development stage:
$ docker build --target development -t po etry --file docker/Dockerfile . [+] Building 1.4s (12/12) FINISHED => [internal] load build definition from Dockerfile 0.0s => => transferring dockerfile: 38B 0.0s => [internal] load .dockerignore 0.0s => => transferring context: 2B 0.0s => [internal] load metadata for docker.io/library/python:3.10.0 1.2s => [internal] load build context 0.0s => => transferring context: 260B 0.0s => [staging 1/5] FROM docker.io/library/python:3.10.0@sha256:bb797f045026352ece65ab 0.0s => => resolve docker.io/library/python:3.10.0@sha256:bb797f045026352ece65ab376c5666 0.0s => CACHED [staging 2/5] RUN curl -sSL https://raw.githubusercontent.com/python-poet 0.0s => CACHED [staging 3/5] WORKDIR /opt/coach_planner 0.0s => CACHED [staging 4/5] COPY ./poetry.lock ./pyproject.toml ./ 0.0s => CACHED [staging 5/5] COPY ./coach_planner ./coach_planner 0.0s => CACHED [development 1/2] WORKDIR /opt/coach_planner 0.0s => CACHED [development 2/2] RUN poetry install 0.0s => exporting to image 0.1s => => exporting layers 0.0s => => writing image sha256:cae979092d3b1e2c833e96f2e9acdfbd6980609f36d907a6547f05f1 0.0s => => naming to docker.io/library/poetry 0.0s
Here only the
developement stages are run!
And using it:
$ docker run -p 8888:8888 -it poetry * Serving Flask app 'coach_planner' (lazy loading) * Environment: development * Debug mode: on * Running on all addresses. WARNING: This is a development server. Do not use it in a production deployment. * Running on http://172.17.0.2:8888/ (Press CTRL+C to quit) * Restarting with stat * Debugger is active! * Debugger PIN: 774-185-805
Or overriding the default
CMD and getting a Python shell:
$ docker run -p 8888:8888 -it poetry python Python 3.10.0 (default, Oct 26 2021, 22:20:53) [GCC 10.2.1 20210110] on linux Type "help", "copyright", "credits" or "license" for more information. >>>
Or even overriding the default
ENTRYPOINT and then starting a poetry shell:
$ docker run -p 8888:8888 --entrypoint /bin/bash -it poetry root@4dd4bb68f510:/opt/coach_planner# poetry shell Spawning shell within /opt/coach_planner/.venv root@4dd4bb68f510:/opt/coach_planner# . /opt/coach_planner/.venv/bin/activate (.venv) root@4dd4bb68f510:/opt/coach_planner#