Build, manage and ship Docker images more easily, and use less disk space when they are pulled locally.
Docker Engine makes it possible to package a piece of code with all its dependencies into an image and run it in a container. An image is built from a Dockerfile, a configuration file that lists the build instructions. It is good practice to follow some rules while writing a Dockerfile.
In this hands-on lab, one will work through examples of different ways to minimize images so that they are easier to manage and ship to different environments.
Why Does One Need To Optimize The Image Size?
Unwanted files, libraries and recommended packages inflate the image, and large images are difficult to manage and occupy more space when pulled locally. The main advantage of small Docker images is cost reduction: small images use less disk space and can be shipped and downloaded faster, which may reduce container start time. Larger images may also carry source code that is not required while the container is running.
From the cloud perspective, smaller images can be uploaded more easily and are easier to maintain. Public cloud providers follow pay-as-you-go pricing, so smaller images can reduce storage and transfer costs. From the Kubernetes perspective, the smaller the image, the less time a node spends pulling it and the sooner the pod can start.
Prerequisite
To derive the most value from this blog, it is recommended to follow the hands-on activities on a local machine. First install Docker for your operating system, then proceed.
Different Ways To Minimize The Docker Image Size
There are multiple ways to minimize the image size as per the requirements.
- Minimize the number of layers
- Do not install unnecessary packages
- After installation, do the clean-up
- Use the smallest base images
- Use a .dockerignore file
- Multistage builds
- Store application data in another place
- Take benefit of caching
- Do not install editors using Dockerfile
- Use the docker image build --squash flag at build time
The following Dockerfile is used as the running example; each technique is applied to it to reduce the image size.
- To install Docker on the Ubuntu Linux distribution
apt update && apt install docker.io -y
- To create a Dockerfile
vim Dockerfile
FROM python:3.9
RUN apt-get update
RUN apt-get install nginx -y
RUN apt-get install php -y
- To build an image from Dockerfile
docker image build -t <dockerhub_username>/<image_name>:<tag> .
docker image build -t pratikshahp/python:learn .
- To list the images
docker images
Image built with Simple Dockerfile: 975 MB
Minimize The Number Of Layers
This technique reduces the number of layers created at build time. The RUN, COPY and ADD instructions in a Dockerfile each create a new filesystem layer, and every extra layer can add to the image size.
Chaining two or more commands with the && operator inside one RUN instruction creates a single layer and can decrease the size of the resulting image.
FROM base_image
RUN apt-get update -y && apt-get install -y \
    package_one \
    package_two
Before minimizing the number of layers, the image built earlier was 975 MB.
- Replace the previous Dockerfile with the code below
vim Dockerfile
FROM python:3.9
RUN apt-get update && \
    apt-get install nginx php -y
- To build the image
docker image build -t pratikshahp/python:learn .
- To list the images
docker images
The image size is 972 MB (3 MB reduced)
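To see where the saving comes from, it can help to count the RUN instructions in both versions of the Dockerfile and, where Docker is installed, to list the size of each layer. The following is a minimal sketch: the file names Dockerfile.before and Dockerfile.after and the helper count_run are hypothetical names chosen for this example, and the files are written to a temporary directory rather than the project folder.

```shell
#!/bin/sh
# Sketch: compare the number of RUN instructions before and after merging.
# Dockerfile.before, Dockerfile.after and count_run are illustrative names.
workdir=$(mktemp -d)

cat > "$workdir/Dockerfile.before" <<'EOF'
FROM python:3.9
RUN apt-get update
RUN apt-get install nginx -y
RUN apt-get install php -y
EOF

cat > "$workdir/Dockerfile.after" <<'EOF'
FROM python:3.9
RUN apt-get update && \
    apt-get install nginx php -y
EOF

# Count the lines that start a RUN instruction (each one adds a layer).
count_run() {
    grep -c '^RUN ' "$1"
}

echo "before: $(count_run "$workdir/Dockerfile.before") RUN instructions"
echo "after:  $(count_run "$workdir/Dockerfile.after") RUN instructions"

# With Docker installed and the image built, list the size of every layer:
#   docker image history pratikshahp/python:learn
```

The docker image history command is left as a comment because it needs a running Docker daemon and the image from the previous step.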
Do Not Install Unnecessary Packages
When installing packages with apt-get, add the --no-install-recommends flag. With this flag, apt-get skips the recommended packages that would otherwise be installed along with the main package. This decreases the image size without affecting the required functionality of the image.
FROM base_image
RUN apt-get update -y && apt-get install -y --no-install-recommends \
    [packages]
With the recommended packages installed, the image size was 972 MB.
- To modify the Dockerfile
vim Dockerfile
FROM python:3.9
RUN apt-get update && \
    apt-get install --no-install-recommends nginx php -y
- To build the image
docker image build -t pratikshahp/python:learn .
- To list the images
docker images
With --no-install-recommends, the size is 960 MB (a further 12 MB reduction).
After Installation Do The Clean-Up
Installing packages leaves behind temporary files, such as the apt package lists, that are no longer required once the installation has finished. One can clean them up by adding rm -rf /var/lib/apt/lists/* to the same RUN instruction that installs the packages.
FROM base_image
RUN apt-get update -y && apt-get install -y --no-install-recommends \
    package_one \
    package_two \
    && rm -rf /var/lib/apt/lists/*
Without the clean-up after installation, the image size was 960 MB.
- To change the Dockerfile
vim Dockerfile
FROM python:3.9
RUN apt-get update && \
    apt-get install --no-install-recommends nginx php -y \
    && rm -rf /var/lib/apt/lists/*
- To build the image
docker image build -t pratikshahp/python:learn .
- To list the images
docker images
With the clean-up after installation, the size is 942 MB (a further 18 MB reduction).
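The clean-up only pays off when it runs in the same RUN instruction as the installation, because a file deleted in a later layer is still stored in the earlier layer. The sketch below contrasts the two variants; the file names Dockerfile.separate and Dockerfile.same are hypothetical and are written to a temporary directory.

```shell
#!/bin/sh
# Sketch: why the clean-up belongs in the same RUN instruction.
# The file names below are illustrative and live in a temporary directory.
workdir=$(mktemp -d)

# Clean-up in its own RUN: the apt lists are already stored in the previous
# layer, so removing them afterwards does not shrink the image.
cat > "$workdir/Dockerfile.separate" <<'EOF'
FROM python:3.9
RUN apt-get update && \
    apt-get install --no-install-recommends nginx php -y
RUN rm -rf /var/lib/apt/lists/*
EOF

# Clean-up in the same RUN: the lists are removed before the layer is saved.
cat > "$workdir/Dockerfile.same" <<'EOF'
FROM python:3.9
RUN apt-get update && \
    apt-get install --no-install-recommends nginx php -y && \
    rm -rf /var/lib/apt/lists/*
EOF

# Show that the only difference is where the rm command runs.
diff "$workdir/Dockerfile.separate" "$workdir/Dockerfile.same" || true
```

Building both files and comparing the results with docker images is one way to confirm the difference on a machine with Docker installed.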
Use the Smallest Base Images
Base images fall into two main categories:
- Base images built on a full-scale OS, such as Ubuntu.
- Base images built on a minimal OS, such as Alpine Linux.
Smaller base images make build and pull operations faster. They also tend to be more secure, because fewer installed packages mean a smaller attack surface.
FROM base_image:alpine
Alpine is not always a good choice as a base image. It has the following issues:
- Compatibility: Alpine uses the musl C library instead of the much more common glibc. This often causes compatibility problems, because most Linux distributions use glibc and software built against glibc may not run on musl without changes.
- Dependencies: Alpine installs dependencies (packages) with apk add instead of apt-get install, and not every package is available for Alpine.
With a full-scale base image, the size was 942 MB.
- To modify the Dockerfile
vim Dockerfile
FROM python:3.9-slim
RUN apt-get update && \
    apt-get install --no-install-recommends nginx php -y \
    && rm -rf /var/lib/apt/lists/*
- To build the image
docker image build -t pratikshahp/python:learn .
- To list the images
docker images
With python:3.9-slim (a minimal Python base image) as the base image, the size is 258 MB (a further 684 MB reduction).
An even smaller base image, such as Alpine, can also be used.
- To modify the Dockerfile
vim Dockerfile
FROM python:alpine
RUN apk update && \
    apk add nginx php
- To build the image
docker image build -t pratikshahp/python:learn .
- To list the images
docker images
With Alpine as the base image, the size is only 71.5 MB (a further 186.5 MB reduction, with a shorter build time).
Note: Alpine is recommended as a base image when low disk usage is the priority and the compatibility issues listed above do not affect the application.
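The sizes reported so far can be put side by side to see how much each step contributes. The following sketch only does arithmetic on the figures quoted in this lab (975, 972, 960, 942, 258 and 71.5 MB); the step labels and the report helper are shorthand invented for this example.

```shell
#!/bin/sh
# Sketch: cumulative savings computed from the sizes reported in this lab (MB).
# The labels and the report function are illustrative names.
report() {
    awk 'BEGIN {
        n = split("975 972 960 942 258 71.5", size, " ")
        split("baseline,merged-RUN,no-recommends,clean-up,slim-base,alpine-base", label, ",")
        for (i = 1; i <= n; i++) {
            saved = size[1] - size[i]
            printf "%-14s %6.1f MB  saved %6.1f MB (%4.1f%%)\n", label[i], size[i], saved, saved / size[1] * 100
        }
    }'
}

report
```

The table makes it clear that the choice of base image dominates: the first three techniques together save 33 MB, while switching to the slim base brings the total saving to 717 MB.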
Use .dockerignore File
A Dockerfile is a text file used to build a customized image from the instructions written in it. The .dockerignore file lists the files and directories that are left out of the build context, so they cannot end up in the image. The .dockerignore file must be placed in the root of the build context, which is usually the folder where the Dockerfile resides.
A sample .dockerignore file:
password.txt
secret.txt
logs
.git
*.md
.cache
How much the size decreases depends on the application's requirements and on the size of the files listed in the .dockerignore file.
Alert: If the Dockerfile itself is listed in the .dockerignore file, current Docker versions still send it to the daemon so the build can run, but COPY and ADD instructions cannot copy it into the image. Older Docker versions may stop with an error message instead.
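A .dockerignore file takes one pattern per line, lines starting with # are comments, and a line starting with ! re-includes a path that an earlier pattern excluded. The sketch below writes such a file into a temporary directory; README.md is a hypothetical file that is kept in the build context as an exception to the *.md pattern.

```shell
#!/bin/sh
# Sketch: write a .dockerignore with one pattern per line.
# README.md is a hypothetical file kept in the context via the "!" exception.
context=$(mktemp -d)

cat > "$context/.dockerignore" <<'EOF'
# Secrets and local state that must not reach the image
password.txt
secret.txt
logs
.git
.cache
# Ignore Markdown files, except the README
*.md
!README.md
EOF

cat "$context/.dockerignore"
```

The exception has to come after the pattern it overrides, because the last matching line decides whether a file is included.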
Multistage Builds
A multistage build keeps a Dockerfile easy to read, maintain and modify while reducing the size of the resulting image. Instead of maintaining separate Dockerfiles for development and production, one Dockerfile contains several FROM stages: an earlier stage builds the application, and the final stage copies only the build output from it. This suits the common case where developers do not ship the source code to the client and package only the binary.
For example, to run a C program, build one image with a simple Dockerfile and another with a multistage Dockerfile, then compare the image sizes.
- To create a C file
vim hello.c
#include <stdio.h>

int main() {
    printf("Hello Everyone!!\nWelcome to Home Page\n");
    return 0;
}
- To create a script file
vim myscript.sh
#!/bin/sh
export DBUS_SYSTEM_BUS_ADDRESS=unix:path=/host/run/dbus/system_bus_socket
# run our binary
/usr/src/app/hello
- To write a Dockerfile
vim Dockerfile
FROM ubuntu AS buildstep
RUN apt-get update && apt-get install -y build-essential gcc
WORKDIR /usr/src/app
COPY ./hello.c hello.c
COPY ./myscript.sh /app/myscript.sh
RUN gcc -o hello hello.c && chmod +x hello
ENV INITSYSTEM=on
CMD ["bash", "/app/myscript.sh"]
- To build an image from a simple Dockerfile
docker image build -t pratikshahp/simpleimg:capp .
- To run the container from the previous image
docker container run pratikshahp/simpleimg:capp
- To write Multistage Dockerfile
vim MultiStageDockerfile
FROM ubuntu AS buildstep
RUN apt-get update && apt-get install -y build-essential gcc
COPY hello.c /app/hello.c
WORKDIR /app
RUN gcc -static -o hello hello.c && chmod +x hello

FROM alpine
RUN mkdir -p /usr/src/app/
WORKDIR /usr/src/app
COPY --from=buildstep /app/hello ./hello
COPY ./myscript.sh ./myscript.sh
CMD ["/bin/sh", "/usr/src/app/myscript.sh"]
- To build an image from a Multi Stage Dockerfile
docker build -t pratikshahp/multistagebuiltimg:capp -f MultiStageDockerfile .
- To run the container from the previous multistage build image
docker container run pratikshahp/multistagebuiltimg:capp
Both containers run from different images and execute the C program the same way!
- To compare the sizes of the simple image and the multistage build image
docker images
The simple image is 394 MB, while the multistage image is significantly smaller at only 7.95 MB.
Store Application Data In Another Place
Storing application data in the image at build time increases its size. To avoid that, use the volume feature of the container runtime to keep application data separate from the image.
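As a rough illustration, the data can live in a named volume that is mounted when the container starts. In the sketch below, the volume name appdata, the mount path /var/lib/myapp and the helper run_with_volume are hypothetical; the image tag is the one built earlier in this lab. The Docker commands are wrapped in a function so that nothing runs until it is called on a machine with Docker installed.

```shell
#!/bin/sh
# Sketch: keep application data in a named volume instead of the image.
# "appdata", "/var/lib/myapp" and run_with_volume are hypothetical names.
volume_name="appdata"
mount_path="/var/lib/myapp"
image="pratikshahp/python:learn"
mount_spec="$volume_name:$mount_path"

run_with_volume() {
    # Create the volume once; it outlives any container that mounts it.
    docker volume create "$volume_name" &&
    # Mount the volume at run time; files written there never enter the image.
    docker container run --rm -v "$mount_spec" "$image" ls -la "$mount_path"
}

echo "mount spec: $mount_spec"
# Call run_with_volume on a machine where Docker is installed.
```

Because the data sits in the volume, rebuilding or replacing the image leaves it untouched.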
Take Advantage Of Caching
To use the build cache efficiently, one should write the instructions in the Dockerfile in the right order. If one step changes, that step and all the following steps have to be rebuilt.
FROM ubuntu
WORKDIR /<dir_name>
# Copy all files in the current directory
COPY . .
# Install packages
RUN apt-get install package_one package_two
# Run build
RUN apt-build [options]
Place expensive commands that rarely change near the start of the Dockerfile, and commands whose inputs change frequently near the end, to minimize rebuilds.
In the scenario above, a change to any file in the project reinstalls all packages on every new image build, even though the packages and package management files did not change.
Now change the scenario and split the COPY command into two parts:
- Copy the package management files and install the packages.
- Copy the project files, which may change.
FROM ubuntu
WORKDIR /<dir_name>
# Copy package management files
COPY source.list dpkg ./
# Install packages
RUN apt-get install package_one package_two
# Copy project files
COPY . .
# Run build
RUN apt-build [options]
Note: Caching will not reduce the image size but it reduces the build time.
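Applied to a Python project, the same ordering might look like the sketch below. The files requirements.txt and app.py are hypothetical project files, not part of this lab; the Dockerfile is written to a temporary directory so the instruction order can be inspected.

```shell
#!/bin/sh
# Sketch: order instructions so the dependency layer stays cached.
# requirements.txt and app.py are hypothetical project files.
project=$(mktemp -d)

cat > "$project/Dockerfile" <<'EOF'
FROM python:3.9-slim
WORKDIR /myapp
# Changes rarely: copy only the dependency list, then install.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Changes often: copy the rest of the project last.
COPY . .
CMD ["python", "app.py"]
EOF

# Print the COPY and RUN instructions with line numbers to check the order.
grep -nE '^(COPY|RUN)' "$project/Dockerfile"
```

With this order, editing app.py invalidates only the final COPY layer, and the pip install layer is reused as long as requirements.txt is unchanged.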
- Start with the following Dockerfile
vim Dockerfile
FROM python:3.9
RUN apt-get update && apt-get install nginx -y
RUN apt-get install php -y
WORKDIR /myapp/
RUN date
COPY . .
- To build the image
docker image build -t pratikshahp/img:learn .
- Now swap the last two instructions of the Dockerfile
vim Dockerfile
FROM python:3.9
RUN apt-get update && apt-get install nginx -y
RUN apt-get install php -y
WORKDIR /myapp/
COPY . .
RUN date
- To build the image again with the modified Dockerfile
docker image build -t pratikshahp/imgcache:learn .
- Modify the Dockerfile once more, this time at the third instruction
vim Dockerfile
FROM python:3.9
RUN apt-get update && apt-get install nginx -y
RUN date
RUN apt-get install php -y
WORKDIR /myapp/
COPY . .
- To build the image
docker image build -t pratikshahp/testimg:learn .
The cache is not used from the third instruction onward, so the php package is installed again.
Do Not Install Editors Using Dockerfile
Developers may install tools such as curl, vim or nano through the Dockerfile, which results in a larger image. If these tools are needed for debugging inside the container, install them during the development phase and remove them from the Dockerfile before building the final image.
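One possible way to avoid editing the Dockerfile between phases, offered here as an assumption rather than as part of the original lab, is to make the tool installation conditional on a build argument. In the sketch below, INSTALL_DEBUG_TOOLS is a hypothetical argument name and pratikshahp/python:debug is a hypothetical tag.

```shell
#!/bin/sh
# Sketch: install debugging tools only when a build argument asks for them.
# INSTALL_DEBUG_TOOLS is a hypothetical build argument, not a Docker built-in.
project=$(mktemp -d)

cat > "$project/Dockerfile" <<'EOF'
FROM python:3.9-slim
ARG INSTALL_DEBUG_TOOLS=false
RUN if [ "$INSTALL_DEBUG_TOOLS" = "true" ]; then \
        apt-get update && \
        apt-get install --no-install-recommends -y curl vim nano && \
        rm -rf /var/lib/apt/lists/*; \
    fi
EOF

# Development build (with editors), using a hypothetical :debug tag:
#   docker image build --build-arg INSTALL_DEBUG_TOOLS=true -t pratikshahp/python:debug "$project"
# Production build (default, no editors):
#   docker image build -t pratikshahp/python:learn "$project"
cat "$project/Dockerfile"
```

The default value keeps the production image free of editors, so nobody has to remember to remove them before release.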
Use The Docker --squash Flag At Build Time
Build the Docker image with the --squash flag to squash the newly built layers and produce an image with fewer layers.
- First check the Docker version
docker version
If the server section of the output shows Experimental: false, --squash will not work. To enable it:
- Create a /etc/docker/daemon.json file
vim /etc/docker/daemon.json
{ "experimental": true }
- Restart docker daemon to reflect the change
systemctl restart docker
- To check the experimental value
docker version -f '{{.Server.Experimental}}'
If the command prints true, the --squash flag can be used.
If the Dockerfile is in the current directory:
docker image build --squash -t <imagename> .
Using the same Dockerfile, create two different images.
vim Dockerfile
FROM python:3.9
RUN apt-get update
RUN apt-get install nginx -y
RUN apt-get install php -y
- To build an image without the --squash flag
docker image build -t pratikshahp/withoutsquash:learn .
- To build an image with the --squash flag
docker image build --squash -t pratikshahp/squash:learn .
- To check the difference in both images
docker images
The image size is 974 MB without the --squash flag and 971 MB with it.
Conclusion
Optimizing images with security in mind and without losing quality has long been a matter of concern. There is no single formula that optimizes all images in the same way; minimizing an image depends on the requirements and use case of the particular application. This hands-on lab presents different ways to optimize images. I would love to hear comments and suggestions for improvement.