iris-docker-multi-stage-script
A Python script to keep your Docker IRIS images in shape ;)
Without changing your Dockerfile or your code, you can reduce the size of your image by 50% or more!
TL;DR
Name the first stage builder and the final stage final.
Modify your Dockerfile to use a multi-stage build:
ARG IMAGE=intersystemsdc/irishealth-community:latest
FROM $IMAGE as builder
Add this to the end of your Dockerfile:
FROM $IMAGE as final
ADD --chown=${ISC_PACKAGE_MGRUSER}:${ISC_PACKAGE_IRISGROUP} https://github.com/grongierisc/iris-docker-multi-stage-script/releases/latest/download/copy-data.py /irisdev/app/copy-data.py
RUN --mount=type=bind,source=/,target=/builder/root,from=builder \
cp -f /builder/root/usr/irissys/iris.cpf /usr/irissys/iris.cpf && \
python3 /irisdev/app/copy-data.py -c /usr/irissys/iris.cpf -d /builder/root/
Boom! You're done!
Usage
usage: copy-data.py [-h] -c CPF -d DATA_DIR [--csp] [-p] [-o OTHER [OTHER ...]]

Copy data from a directory to the IRIS data directory

optional arguments:
  -h, --help            show this help message and exit
  -c CPF, --cpf CPF     path to the iris.cpf file
  -d DATA_DIR, --data_dir DATA_DIR
                        path to the directory where the data files are located
  --csp                 toggle the copy of the whole CSP folder
  -p, --python          toggle the copy of python libs
  -o OTHER [OTHER ...], --other OTHER [OTHER ...]
                        toggle the copy of other folders
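The help text above could be produced by an argparse definition along these lines. This is only a sketch reconstructed from the usage message, not the actual source of copy-data.py: the function name build_parser is hypothetical, and treating --csp and --python as store_true flags that default to off is an assumption. The path /opt/extra in the example call is also made up for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented command line (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="copy-data.py",
        description="Copy data from a directory to the IRIS data directory",
    )
    # -c and -d appear without brackets in the usage line, so they are required.
    parser.add_argument("-c", "--cpf", required=True,
                        help="path to the iris.cpf file")
    parser.add_argument("-d", "--data_dir", required=True,
                        help="path to the directory where the data files are located")
    # Assumption: the "toggle" options are simple on/off flags, off by default.
    parser.add_argument("--csp", action="store_true",
                        help="toggle the copy of the whole CSP folder")
    parser.add_argument("-p", "--python", action="store_true",
                        help="toggle the copy of python libs")
    # One or more extra folders, matching "OTHER [OTHER ...]" in the usage line.
    parser.add_argument("-o", "--other", nargs="+",
                        help="toggle the copy of other folders")
    return parser


# Same arguments as in the Dockerfile, plus one hypothetical extra folder.
args = build_parser().parse_args(
    ["-c", "/usr/irissys/iris.cpf", "-d", "/builder/root/", "-o", "/opt/extra"]
)
print(args.cpf, args.data_dir, args.csp, args.python, args.other)
```

Only -c and -d are needed for the basic case shown in the TL;DR; the other options extend what gets copied.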
How to use it
First, have a look at a non-multi-stage Dockerfile for IRIS:
ARG IMAGE=intersystemsdc/irishealth-community:latest
FROM $IMAGE
WORKDIR /irisdev/app
RUN chown ${ISC_PACKAGE_MGRUSER}:${ISC_PACKAGE_IRISGROUP} /irisdev/app
USER ${ISC_PACKAGE_MGRUSER}
# copy source code
COPY src src
COPY misc misc
COPY data/fhir fhirdata
COPY iris.script /tmp/iris.script
COPY fhirUI /usr/irissys/csp/user/fhirUI
# start IRIS and run the initial script
RUN iris start IRIS \
&& iris session IRIS < /tmp/iris.script \
&& iris stop IRIS quietly
This is a simple Dockerfile that builds an image with the IRIS source code and the FHIR data. It also runs iris.script to create the FHIR database and load the data.
With this kind of Dockerfile you end up with a big image. This is not a problem if you only build the image in a CI/CD pipeline, but if you run this image in production it will take a lot of space on your server.
Then have a look at a multi-stage Dockerfile for IRIS:
ARG IMAGE=intersystemsdc/irishealth-community:latest
FROM $IMAGE as builder
WORKDIR /irisdev/app
RUN chown ${ISC_PACKAGE_MGRUSER}:${ISC_PACKAGE_IRISGROUP} /irisdev/app
USER ${ISC_PACKAGE_MGRUSER}
# copy source code
COPY src src
COPY misc misc
COPY data/fhir fhirdata
COPY iris.script /tmp/iris.script
COPY fhirUI /usr/irissys/csp/user/fhirUI
# start IRIS and run the initial script
RUN iris start IRIS \
&& iris session IRIS < /tmp/iris.script \
&& iris stop IRIS quietly
# copy data from builder
FROM $IMAGE as final
ADD --chown=${ISC_PACKAGE_MGRUSER}:${ISC_PACKAGE_IRISGROUP} https://github.com/grongierisc/iris-docker-multi-stage-script/releases/latest/download/copy-data.py /irisdev/app/copy-data.py
RUN --mount=type=bind,source=/,target=/builder/root,from=builder \
cp -f /builder/root/usr/irissys/iris.cpf /usr/irissys/iris.cpf && \
python3 /irisdev/app/copy-data.py -c /usr/irissys/iris.cpf -d /builder/root/
This multi-stage Dockerfile builds the same content: the IRIS source code, the FHIR data, and the database created by iris.script. The difference is that the final stage starts again from a clean base image and copies only the data from the builder stage, so the layers created during the build are left behind. This reduces the size of the final image.
Let's go through the multi-stage Dockerfile in detail:
ARG IMAGE=intersystemsdc/irishealth-community:latest
FROM $IMAGE as builder
Define the base image and name the first stage builder.
WORKDIR /irisdev/app
RUN chown ${ISC_PACKAGE_MGRUSER}:${ISC_PACKAGE_IRISGROUP} /irisdev/app
USER ${ISC_PACKAGE_MGRUSER}
# copy source code
COPY src src
COPY misc misc
COPY data/fhir fhirdata
COPY iris.script /tmp/iris.script
COPY fhirUI /usr/irissys/csp/user/fhirUI
# start IRIS and run the initial script
RUN iris start IRIS \
&& iris session IRIS < /tmp/iris.script \
&& iris stop IRIS quietly
This is basically the same as the non-multi-stage Dockerfile.
FROM $IMAGE as final
Start again from the same base image and name this stage final.
ADD --chown=${ISC_PACKAGE_MGRUSER}:${ISC_PACKAGE_IRISGROUP} https://github.com/grongierisc/iris-docker-multi-stage-script/releases/latest/download/copy-data.py /irisdev/app/copy-data.py
Add the copy-data.py script to the image with the right user and group
RUN --mount=type=bind,source=/,target=/builder/root,from=builder \
cp -f /builder/root/usr/irissys/iris.cpf /usr/irissys/iris.cpf && \
python3 /irisdev/app/copy-data.py -c /usr/irissys/iris.cpf -d /builder/root/
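A lot is happening here.
First, we use the --mount option to mount the filesystem of the builder stage (RUN --mount requires BuildKit). The mount only exists while this RUN instruction executes, so nothing from it ends up in the final image unless it is copied explicitly.
- --mount=type=bind is the type of mount
- source=/ is the path to mount, here the root of the builder stage
- target=/builder/root is the path in the final stage where that root is mounted
- from=builder is the name of the stage to mount from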
Then we are copying the iris.cpf file from the builder image to the final image.
cp -f /builder/root/usr/irissys/iris.cpf /usr/irissys/iris.cpf
Finally we are running the copy-data.py script to copy the data from the builder image to the final image.
python3 /irisdev/app/copy-data.py -c /usr/irissys/iris.cpf -d /builder/root/
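This article does not show how copy-data.py works internally. As a rough illustration only, the sketch below assumes the script reads the [Databases] section of iris.cpf, where each entry maps a database name to its directory, and copies each of those directories from the mounted builder root to the same path in the final stage. The function names database_dirs and copy_databases are hypothetical, and the real script may well do more or work differently.

```python
import shutil
from pathlib import Path


def database_dirs(cpf_text: str) -> list:
    """Return the directory of every entry in the [Databases] section.

    Assumption: entries look like NAME=/path/to/db/ optionally followed by
    comma-separated fields, with the directory as the first field.
    """
    dirs = []
    in_databases = False
    for raw in cpf_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # Track whether we are inside the [Databases] section.
            in_databases = line == "[Databases]"
            continue
        if in_databases and "=" in line:
            _name, value = line.split("=", 1)
            dirs.append(value.split(",")[0])
    return dirs


def copy_databases(cpf_path: str, data_dir: str) -> None:
    """Copy each database directory from data_dir (the builder root) to its real path."""
    for db_dir in database_dirs(Path(cpf_path).read_text()):
        # /usr/irissys/mgr/user/ in the CPF lives at <data_dir>/usr/irissys/mgr/user/.
        src = Path(data_dir) / db_dir.lstrip("/")
        if src.is_dir():
            shutil.copytree(src, db_dir, dirs_exist_ok=True)
```

With the arguments used in the Dockerfile, this would correspond to calling copy_databases("/usr/irissys/iris.cpf", "/builder/root/"). Copying the iris.cpf file first matters in this sketch because the final stage then knows about the databases that were created in the builder stage.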
Side by side comparison
Non multi-stage Dockerfile
ARG IMAGE=intersystemsdc/irishealth-community:latest
FROM $IMAGE
WORKDIR /irisdev/app
RUN chown ${ISC_PACKAGE_MGRUSER}:${ISC_PACKAGE_IRISGROUP} /irisdev/app
USER ${ISC_PACKAGE_MGRUSER}
COPY . .
RUN iris start IRIS \
&& iris session IRIS < iris.script \
&& iris stop IRIS quietly
Multi-stage Dockerfile
ARG IMAGE=intersystemsdc/irishealth-community:latest
FROM $IMAGE as builder
WORKDIR /irisdev/app
RUN chown ${ISC_PACKAGE_MGRUSER}:${ISC_PACKAGE_IRISGROUP} /irisdev/app
USER ${ISC_PACKAGE_MGRUSER}
COPY . .
RUN iris start IRIS \
&& iris session IRIS < iris.script \
&& iris stop IRIS quietly
FROM $IMAGE as final
ADD --chown=${ISC_PACKAGE_MGRUSER}:${ISC_PACKAGE_IRISGROUP} https://github.com/grongierisc/iris-docker-multi-stage-script/releases/latest/download/copy-data.py /irisdev/app/copy-data.py
RUN --mount=type=bind,source=/,target=/builder/root,from=builder \
cp -f /builder/root/usr/irissys/iris.cpf /usr/irissys/iris.cpf && \
python3 /irisdev/app/copy-data.py -c /usr/irissys/iris.cpf -d /builder/root/
Thanks, this is very useful. I've just tested this on an image with various build steps, and this saves us quite a bit of image space. The copy step now adds a little under 300MB to the base image, instead of almost 5GB (!). Squashing an image has the same effect, but prevents layer caching, so each push to a docker repository would upload the entire image. Your way, after the first time, presumably just the 300MB. Nice!
Interestingly, we had already contacted our sales engineers about the massive amount of image disk space used after our build steps. I couldn't find what it's used for; the actual Linux filesystem is way smaller. The build steps also don't visibly use significant disk space, as far as I could find. I'm hoping InterSystems manages to do something about this in the future. In the meantime, we've got a nice workaround. Thanks again!
You are welcome.
And I'm very happy that you find this trick useful. :)
Great stuff!
Does this increase the build time? If so, by how much?
In our case, it adds 2.3s to a build that takes way longer than that. (Our build creates Foundation namespaces, and loading the FHIR resources takes insanely long.) I expect this to be Python startup time, plus some time proportional to the amount of data to copy (roughly 300MB in my test).
Thanks @Gertjan Klein, that is not bad for the value it provides.
I've just released the new IRIS DEV template that uses your approach, @Guillaume Rongier.
For vanilla IRIS Data Platform Community Edition it saves 700MB: 2.1G instead of 2.7G.
And for vanilla IRIS for Health Community Edition it saves 3.2G(!): 2.8G instead of 6G...
Want to save some space when building your dev images with IRIS? Use the approach described here, or use our dev template directly.
Looks great. Where's the source for copy-data.py?
https://github.com/grongierisc/iris-docker-multi-stage-script/blob/main/...
Hi Guillaume,
I have also tested this trick, and it reduces the size of the image to a good extent.
With the non-multi-stage Dockerfile the size of the image is around 7.57GB.
With the multi-stage Dockerfile the size is 3.77GB. Nice!
But when I inspect the image, its architecture changes from amd to arm.
Here is my Dockerfile:
Without the multi-stage Dockerfile:
docker inspect <imageid>
With the multi-stage Dockerfile:
You didn't specify the platform on the FROM line of the final stage:
FROM --platform=linux/amd64 $IMAGE
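To confirm which platform an image ended up with, the output of docker inspect can be checked programmatically. The sketch below is only an illustration: it relies on docker inspect printing a JSON array whose image entries carry Os and Architecture fields, and the helper names image_platform and inspect_image are hypothetical.

```python
import json
import subprocess


def image_platform(inspect_json: str) -> str:
    """Extract "os/architecture" from the JSON text printed by docker inspect."""
    info = json.loads(inspect_json)[0]
    return f"{info['Os']}/{info['Architecture']}"


def inspect_image(image: str) -> str:
    """Run docker inspect on an image (requires the docker CLI) and return its platform."""
    result = subprocess.run(
        ["docker", "inspect", image],
        capture_output=True, text=True, check=True,
    )
    return image_platform(result.stdout)
```

Calling inspect_image on the image ID of both builds makes it easy to compare the two results against the platform you expect.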
💡 This article is considered InterSystems Data Platform Best Practice.