Building GROMACS in docker part 2
In the previous post we discussed how the GROMACS Docker container was originally made, and which options we went through to improve the workflow, finally settling on GitHub Actions. In this part we describe how we implemented that workflow.
The build consists of 4 steps:
- Get the build parameters (GROMACS version, SIMD types and so on)
- Build a container containing FFTW optimised for all SIMD types.
- Build a GROMACS container optimised for each SIMD type, using the FFTW container.
- Combine all optimised GROMACS containers into one container.
Get the build parameters
This took the most work. Using gromacs-hpccm-recipes-mult-stages we are able to build Dockerfiles for almost any configuration of GROMACS we can think of. Parametrising most of these variables was simple: they can be specified at the beginning of the file and used later.
Variables can be defined and set in a step or a job. Additionally, they can be defined at the beginning of a workflow file, making them global to the file. Here we define the variable gromacs to hold the GROMACS version.
---
name: Build and push to Docker Hub
on: [push,pull_request]
env:
  gromacs: 2020.2
The variables can then be retrieved through the env context, like so:
${{ env.gromacs }}
In the same way, we can specify all the parameters we need for creating the Dockerfiles.
---
name: Build and push to Docker Hub
on: [push,pull_request]
env:
  # docker_repo: This must be changed between forks. This should be the
  #              dockerhub repository you will be using to register the
  #              docker containers to.
  docker_repo: longr/gromacs-docker
  # Here you can specify the versions of the
  # various parameters, such as gcc or cuda.
  fftw: 3.3.8
  fftw_md5: 8aac833c943d8e90d51b697b27d4384d
  gromacs: 2020.2
  cmake: 3.17.1
  gcc: 8
  cuda: 10.1
  openmpi: 4.0.0
  ubuntu: 18.04
  rdtscp: off
  # additional_simd_types: This is where you need to specify any
  #                        SIMD types that you want to build a
  #                        container for.
  additional_simd_types: "sse2 avx avx2"
Next we need to create a Docker container for the optimised version of FFTW. We can write a Dockerfile for this quite simply:
FROM ubuntu:18.04
ARG FFTW_VERSION=3.3.8
ARG FFTW_MD5=8aac833c943d8e90d51b697b27d4384d
# install required packages
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
software-properties-common \
&& add-apt-repository ppa:ubuntu-toolchain-r/test \
&& apt-get update \
&& apt-get install -y --no-install-recommends \
build-essential \
curl \
gcc-9 \
&& rm -rf /var/lib/apt/lists/*
# Install fftw with more optimizations than the default packages. It
# is not critical to run the tests here, since the GROMACS tests will
# catch fftw build errors too.
RUN curl -o fftw.tar.gz http://www.fftw.org/fftw-${FFTW_VERSION}.tar.gz \
&& echo "${FFTW_MD5} fftw.tar.gz" > fftw.tar.gz.md5 \
&& md5sum -c fftw.tar.gz.md5 \
&& tar -xzvf fftw.tar.gz && cd fftw-${FFTW_VERSION} \
&& ./configure --disable-double --enable-float --disable-static --enable-shared \
&& make \
&& make install \
&& rm -rf fftw.tar.gz fftw.tar.gz.md5 fftw-${FFTW_VERSION}
The problem with this Dockerfile is that the FFTW version and the Ubuntu version are hard-coded, and no SIMD types are enabled. To fix this we can use sed in the workflow file:
  # Builds the fftw container needed for the final combined container.
  build_fftw_container:
    runs-on: ubuntu-18.04
    if: "!contains(github.event.head_commit.message, 'ci skip')"
    steps:
      - uses: actions/checkout@master
      # Find and replace default fftw with version specified at top
      - name: Set FFTW version
        run: |
          sed -i "s/3.3.8/${{ env.fftw }}/g" fftw/Dockerfile
          sed -i "s/8aac833c943d8e90d51b697b27d4384d/${{ env.fftw_md5 }}/g" fftw/Dockerfile
          sed -i "s/18.04/${{ env.ubuntu }}/g" fftw/Dockerfile
Here we use sed to replace the hard-coded values with the variables we specified at the beginning of the file. The next thing we need to do is add the optimisation flags to the configure line. We do this with sed again, but this time the command needs an extra part. The normal sed syntax for find and replace is
sed -i "s/<find>/<replace>/g" <file>
In the replacement string, & stands for the text that was matched. By putting & at the beginning of the replacement we tell sed to keep the matched text and append our new text after it, instead of replacing it.
sed -i "s/<find>/&<replace>/g" <file>
This way we can search for ./configure and add an optimisation flag --enable-simd_type, which results in ./configure --enable-simd_type. If we do this once per SIMD type, FFTW is built with optimisations for all the SIMD types. We can do this using a for loop.
      # Loop over simd types and append to build command in Dockerfile.
      - name: Add SIMD types
        run: |
          for type in ${{ env.additional_simd_types }}
          do
            sed -i "s/configure/& --enable-$type /g" fftw/Dockerfile
          done
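The effect of & in the loop can be tried outside the workflow. The following is a minimal sketch, assuming a shortened configure line and a local variable simd_types standing in for additional_simd_types (both are stand-ins for illustration, not part of the workflow). It works on a string rather than editing a file in place, so it does not depend on sed -i.

```shell
# Hypothetical stand-in for the additional_simd_types workflow variable.
simd_types="sse2 avx avx2"

# A shortened version of the configure line from the FFTW Dockerfile.
configure_line='&& ./configure --disable-double --enable-float'

for type in $simd_types; do
  # & re-inserts the matched text ("configure"), then the new flag follows it.
  configure_line=$(printf '%s\n' "$configure_line" \
    | sed "s/configure/& --enable-$type /g")
done

# The flags end up in reverse order, because every pass inserts its flag
# directly after "configure".
printf '%s\n' "$configure_line"
```

Each pass leaves the existing flags untouched, so the loop can be run for any number of SIMD types.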
The last thing we need to do to create the FFTW container is to build the Docker container and push it to Docker Hub. We split this into three steps: first, build the container from our now modified Dockerfile; second, log in to Docker Hub using the CLI; and finally, push the container to Docker Hub.
      - name: Build fftw container
        run: |
          docker build -t "${{ env.docker_repo }}:fftw-${{ env.fftw }}" -f fftw/Dockerfile .
      - name: Docker Login
        run: docker login -u $ -p $
      - name: Docker Push
        if: "$"
        run: |
          docker push "${{ env.docker_repo }}:fftw-${{ env.fftw }}"
          sleep 60
          # Needed to give time to register container before being
          # pulled by next steps.
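The fixed sleep is the approach used in this workflow. As an alternative sketch only, a later step could instead retry the pull until it succeeds. The helper name retry, the attempt count and the delay below are hypothetical and are not taken from the workflow.

```shell
# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds, at most ATTEMPTS times, waiting DELAY
# seconds between tries. Returns 1 if every attempt fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}

# Example use in a later step (image name is a placeholder):
#   retry 10 15 docker pull "example/repo:fftw-3.3.8"
```

This waits only as long as the registry actually needs, at the cost of a few more lines in the workflow.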
Building optimised containers
The next thing we need to do is create and build the optimised containers for each SIMD type. GitHub Actions allows us to create a matrix of build options to be used in a job. We can create a key: value pair and assign it an array of values; the job is then run once for each value.
    strategy:
      matrix:
        simd_type: ['avx', 'sse2']
    steps:
      - name: Build Container
        run: Build ${{ matrix.simd_type }}
This is how we first implemented it, but it required modifying the workflow file in multiple locations, which was undesirable. A solution was found with fromJSON, which allows the matrix to be supplied as a JSON string instead of key: value pairs. To pass the JSON along, you use the echo command in a job to set it as an output. The example GitHub gives is:
name: build
on: push
jobs:
  job1:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set-matrix.outputs.matrix }}
    steps:
      - id: set-matrix
        run: echo "::set-output name=matrix::{\"include\":[{\"project\":\"foo\",\"config\":\"Debug\"},{\"project\":\"bar\",\"config\":\"Release\"}]}"
  job2:
    needs: job1
    runs-on: ubuntu-latest
    strategy:
      matrix: ${{fromJSON(needs.job1.outputs.matrix)}}
    steps:
      - run: build
This is a complex statement to edit, but as it is just a string we can create it from a much simpler variable: additional_simd_types, which we specified earlier.
We now need to create a job that will build and export the JSON we need from our input variable:
  get_builds:
    runs-on: ubuntu-18.04
    outputs:
      matrix: ${{ steps.set-matrix.outputs.matrix }}
    steps:
      - id: set-matrix
        run: |
          for type in ${{ env.additional_simd_types }}; do SIMD=$SIMD{\"simd\":\"$type\"}; done;
          SIMD=`echo $SIMD | sed 's/}{/},{/g'`
          echo "::set-output name=matrix::{\"include\":[$SIMD]}"
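The string handling in this job can be checked locally before committing it. Below is a minimal sketch of the same idea wrapped in a function; the function name build_matrix_json is hypothetical, and the list of SIMD types is passed in as an argument instead of coming from the workflow variable.

```shell
# build_matrix_json "TYPE1 TYPE2 ..."
# Prints a JSON object of the form {"include":[{"simd":"TYPE1"},...]}.
build_matrix_json() {
  entries=""
  for type in $1; do
    # Append one {"simd":"<type>"} object per SIMD type, with no separator yet.
    entries="$entries{\"simd\":\"$type\"}"
  done
  # Adjacent objects now read "}{"; turn each of those into "},{".
  entries=$(printf '%s\n' "$entries" | sed 's/}{/},{/g')
  printf '{"include":[%s]}\n' "$entries"
}

build_matrix_json "sse2 avx avx2"
```

The printed string is what the job passes to the set-output line, so pasting it into a JSON validator is a quick way to catch quoting mistakes.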
This job creates the JSON which can then be used as an input to the build job:
  # Build sub containers, one for each SIMD type
  build_subcontainer:
    needs: [build_fftw_container, get_builds]
    # Fetch JSON created from additional_simd_types
    strategy:
      matrix: ${{ fromJSON(needs.get_builds.outputs.matrix) }}
    runs-on: ubuntu-18.04
To create the Dockerfiles for our optimised containers we need two things: Python and gromacs-hpccm-recipes-mult-stages. The latter is kept as a submodule in our repository, so we need to pass an option to checkout to check out the submodule as well.
    if: "!contains(github.event.head_commit.message, 'ci skip')"
    steps:
      - uses: actions/checkout@v2
        with:
          submodules: 'true'
We then need to set up Python and install hpccm:
      - uses: actions/setup-python@v1
        with:
          python-version: "3.7"
      - name: Install python dependencies
        run: |
          set -xe
          python -VV
          python -m site
          python -m pip install --upgrade pip
          python -m pip install hpccm
We now need to create the Dockerfile using gromacs-hpccm-recipes-mult-stages:
      # The Dockerfiles must be generated based on SIMD type, Gromacs
      # version and CUDA version
      - name: Generate Dockerfiles
        env:
          docker_tag: gmx-${{ env.gromacs }}-cuda-${{ env.cuda }}-${{ matrix.simd }}
        run: |
          cd gromacs-hpccm-recipes-mult-stages
          python3 generate_specifications_file.py \
            --format docker \
            --gromacs ${{ env.gromacs }} \
            --ubuntu ${{ env.ubuntu }} \
            --gcc ${{ env.gcc }} \
            --cuda ${{ env.cuda }} \
            --cmake ${{ env.cmake }} \
            --engines simd=${{ matrix.simd }}:rdtscp=${{ env.rdtscp }} \
            --fftw-container ${{ env.docker_repo }}:fftw-${{ env.fftw }} \
            --regtest \
            > Dockerfile
Finally we need to build and register the Docker container as we did for FFTW:
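      - name: Build the Docker image
        env:
          docker_tag: gmx-${{ env.gromacs }}-cuda-${{ env.cuda }}-${{ matrix.simd }}
        run: |
          cd gromacs-hpccm-recipes-mult-stages
          docker build -t "${{ env.docker_repo }}:${{ env.docker_tag }}" -f Dockerfile .
      - name: Docker Login
        run: docker login -u $ -p $
      - name: Docker Push
        env:
          docker_tag: gmx-${{ env.gromacs }}-cuda-${{ env.cuda }}-${{ matrix.simd }}
        if: "$"
        run: |
          docker push "${{ env.docker_repo }}:${{ env.docker_tag }}"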
This then builds and registers one optimised GROMACS container for each SIMD type specified in additional_simd_types.
Creating a combined container
As with the FFTW container, we have a pre-built Dockerfile:
FROM nvidia/cuda:10.2-runtime-ubuntu18.04
# install required packages
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
libgomp1 \
liblapack3 \
openmpi-bin \
openmpi-common \
python3 \
&& rm -rf /var/lib/apt/lists/*
## Add the fftw3 libraries
#COPY --from=gromacs/fftw /usr/local/lib /usr/local
# Add the GROMACS configurations
# Add architecture-detection script
COPY gmx-chooser /gromacs/bin/gmx
RUN chmod +x /gromacs/bin/gmx
ENV PATH=$PATH:/gromacs/bin
We first need to change the Ubuntu and CUDA versions to those specified in the variables at the beginning:
  # Combine all containers into one primary container and
  # publish to docker hub
  build_final_container:
    needs: build_subcontainer
    runs-on: ubuntu-18.04
    if: "!contains(github.event.head_commit.message, 'ci skip')"
    # Only combine and push to Docker Hub if we are on dev branch (TODO: master) and
    # this is not a pull request. Skip if commit message is "ci skip"
    steps:
      - uses: actions/checkout@master
      - name: Edit Dockerfile- Set repo
        run: |
          sed -i "s|gromacs/gromacs-docker|${{ env.docker_repo }}|g" Dockerfile
      - name: Edit Dockerfile- Set CUDA version
        run: |
          sed -i "s|FROM nvidia/cuda:10.2|FROM nvidia/cuda:${{ env.cuda }}|g" Dockerfile
      - name: Edit Dockerfile- Set Ubuntu version
        run: |
          sed -i "s|runtime-ubuntu18.04|runtime-ubuntu${{ env.ubuntu }}|g" Dockerfile
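The version substitutions can be tried on a single line before running them against the real Dockerfile. This is a minimal sketch, assuming local variables cuda_version and ubuntu_version as stand-ins for the workflow variables. One detail it adds: an unescaped dot in a sed pattern matches any character, so the dots in the version numbers are escaped here to match a literal dot only.

```shell
# Hypothetical stand-ins for the cuda and ubuntu workflow variables.
cuda_version="10.1"
ubuntu_version="18.04"
from_line='FROM nvidia/cuda:10.2-runtime-ubuntu18.04'

# With | as the delimiter, the slash in nvidia/cuda needs no escaping.
with_pipe=$(printf '%s\n' "$from_line" \
  | sed "s|FROM nvidia/cuda:10\.2|FROM nvidia/cuda:$cuda_version|" \
  | sed "s|runtime-ubuntu18\.04|runtime-ubuntu$ubuntu_version|")

# The same first substitution with / as the delimiter: every slash in the
# pattern and in the replacement has to be written as \/ instead.
with_slash=$(printf '%s\n' "$from_line" \
  | sed "s/FROM nvidia\/cuda:10\.2/FROM nvidia\/cuda:$cuda_version/")

printf '%s\n%s\n' "$with_pipe" "$with_slash"
```

Both forms give the same FROM line; the | version is simply easier to read when the text contains paths or image names.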
We then need to add the lines that tell Docker to copy the GROMACS binaries from our optimised containers. We do this by looping over additional_simd_types and using the a (append) command in sed.
sed -i '/<find>/a <text>' <file>
This tells sed to find every line matching 'find' and insert 'text' on the line after it. You will also notice that in the substitution commands above we used | instead of '/' as the delimiter. For the s command, sed does not care which character is used as the delimiter, and by using | we don't have to escape every forward slash that we use.
      - name: Edit Dockerfile- Set SIMD types to be loaded
        run: |
          for type in ${{ env.additional_simd_types }}
          do
            sed -i "/GROMACS configurations/a COPY --from=gromacs/gromacs-docker:gmx-${{ env.gromacs }}-cuda-${{ env.cuda }}-$type /gromacs /gromacs" Dockerfile
          done
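To see what the append loop produces, it can be run on a two-line excerpt of the Dockerfile. The following is a minimal sketch: the function name append_copy_lines and the image name example/repo:gmx are hypothetical, and the text is handled as a string rather than edited in place. It uses the portable form of the a command, with a backslash and the text on the next line; GNU sed, as found on Ubuntu, also accepts the text on the same line.

```shell
# Hypothetical image name; the workflow uses its own repository and tags.
image_prefix="example/repo:gmx"

# append_copy_lines "DOCKERFILE TEXT" "TYPE1 TYPE2 ..."
# Prints the text with one COPY line per SIMD type added after the marker.
append_copy_lines() {
  text=$1
  for type in $2; do
    # /pattern/a appends the given text after every line matching pattern.
    text=$(printf '%s\n' "$text" | sed "/GROMACS configurations/a\\
COPY --from=$image_prefix-$type /gromacs /gromacs")
  done
  printf '%s\n' "$text"
}

dockerfile_text='# Add the GROMACS configurations
# Add architecture-detection script'

append_copy_lines "$dockerfile_text" "sse2 avx"
```

Each new COPY line is inserted directly below the marker comment, so the types appear in the reverse of the order in which they were listed.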
Finally we need to build and publish the docker container:
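```yaml
      - name: Build the combined Docker image
        env:
          docker_tag: gmx-${{ env.gromacs }}-cuda-${{ env.cuda }}
        run: |
          docker build -t "${{ env.docker_repo }}:${{ env.docker_tag }}" -t "${{ env.docker_repo }}:latest" -f Dockerfile .
      - name: Docker Login
        run: docker login -u $ -p $
      - name: Docker Push version tag
        env:
          docker_tag: gmx-${{ env.gromacs }}-cuda-${{ env.cuda }}
        run: |
          docker push "${{ env.docker_repo }}:${{ env.docker_tag }}"
```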
This should then build and tag the Docker container, and push it to Docker Hub. Experience (and the sleep in the workflow that allows for it) suggests there is a delay of about a minute between pushing a container and being able to pull it.