Understanding Docker Image Optimization: Techniques for Effective Deployment

Build, manage, and ship Docker images more easily, and use less disk space when they are pulled locally

Docker Engine makes it possible to package a piece of code with all its dependencies and run it in a container created from an image. An image is built from a Dockerfile, a text file that holds the build instructions. It is good practice to follow some rules while writing a Dockerfile.

In this hands-on lab, one will work through examples of the different ways to minimize images so that they become easier to manage and ship to different environments.

Why Optimize The Image Size?

Unneeded files, libraries, and recommended packages inflate the image, and large images are harder to manage and occupy more space when pulled locally. The main advantage of small Docker images is lower cost: they use less storage and can be shipped and downloaded faster, which may reduce container start time. Larger images may also contain source code that is not required while the container is running.

From the cloud perspective, smaller images can be uploaded more easily and are easier to maintain. Public cloud providers follow pay-as-you-go pricing, so one needs to minimize the image to get the benefit. From the Kubernetes perspective, the smaller the image, the faster it is pulled to a node and the faster the pod can start.

Prerequisite

To derive the most value from this blog, it is recommended to follow the hands-on activities on your local machine. First install Docker for your operating system, then proceed.

Different Ways To Minimize The Docker Image Size

There are multiple ways to minimize the image size as per the requirements.

  1. Minimize the number of layers
  2. Do not install unnecessary packages
  3. After installation, do the clean-up
  4. Use the smallest base images
  5. Use .dockerignore file
  6. Multistage builds
  7. Store application data in another place
  8. Take advantage of caching
  9. Do not install editors using Dockerfile
  10. Use docker --squash flag at build time

The Dockerfile below is taken as the starting example for decreasing the image size.

  • To install Docker on the Ubuntu Linux distribution
apt update && apt install docker.io -y 
  • To Create a Dockerfile
vim Dockerfile 
FROM python:3.9
RUN apt-get update 
RUN apt-get install nginx -y
RUN apt-get install php -y
  • To build an image from Dockerfile
docker image build -t <dockerhub_username>/<image_name>:<tag> .
docker image build -t pratikshahp/python:learn .
  • To list the images
docker images
Figure 1: Example image

Image built with Simple Dockerfile: 975 MB

Minimize The Number Of Layers

In this technique, we minimize the number of layers created at build time. Instructions such as RUN, COPY and ADD each add a filesystem layer to the image, while other instructions only add metadata or intermediate images. Every extra layer can increase the image size.

Chaining two or more commands in one RUN instruction with the && operator creates a single layer and can decrease the size of the resulting image.

FROM base_image
RUN apt-get update && apt-get install -y \
  package_one \
  package_two

Before the number of layers was minimized, the image built earlier was 975 MB.

  • Replace the previous Dockerfile with the code below
vim Dockerfile 
FROM python:3.9
RUN apt-get update && \
    apt-get install nginx php -y
  • To build the image
docker image build -t pratikshahp/python:learn .
  • To list the images
docker images
Figure 2: Image with a minimized number of layers

The image size is 972 MB (a reduction of 3 MB).
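
The effect on the layer count can be checked before building. The following is a minimal sketch, not part of the original lab: the helper name count_layer_instructions is hypothetical, and it assumes that only RUN, COPY and ADD add filesystem layers. The docker image history call at the end runs only when Docker is installed and uses the image tag from this lab.

```shell
#!/bin/sh
# Hypothetical helper: count the instructions that add a filesystem layer
# (RUN, COPY, ADD) in a Dockerfile read from standard input.
count_layer_instructions() {
    grep -cE '^(RUN|COPY|ADD)[[:space:]]'
}

# The first Dockerfile of this lab: three separate RUN instructions.
separate=$(printf '%s\n' 'FROM python:3.9' 'RUN apt-get update' \
    'RUN apt-get install nginx -y' 'RUN apt-get install php -y' | count_layer_instructions)

# The combined version: one RUN instruction chained with &&.
combined=$(printf '%s\n' 'FROM python:3.9' 'RUN apt-get update && \' \
    '    apt-get install nginx php -y' | count_layer_instructions)

echo "separate RUN instructions: $separate, combined: $combined"

# With Docker available, list the real layers and their sizes for the lab image.
if command -v docker >/dev/null 2>&1; then
    docker image history pratikshahp/python:learn 2>/dev/null || true
fi
```

The count only looks at the Dockerfile text; docker image history shows what each layer actually contributes in size.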

Do not Install Unnecessary Packages

When installing packages with apt-get, add the --no-install-recommends flag. It skips the recommended packages that would otherwise be installed alongside the requested ones. This decreases the image size without affecting what the image needs to run.

FROM base_image
RUN apt-get update && apt-get install -y --no-install-recommends \
  [packages]

With the recommended packages installed, the image size was 972 MB.

  • To modify Dockerfile
vim Dockerfile 
FROM python:3.9
RUN apt-get update && \
    apt-get install --no-install-recommends nginx php -y
  • To build the image
docker image build -t pratikshahp/python:learn .
  • To list the images
docker images
Figure 3: Image with --no-install-recommends

With --no-install-recommends, the size is 960 MB (a further 12 MB reduction).

After Installation Do The Clean-Up

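Installing packages leaves behind temporary files, such as the apt package lists, that are no longer required once installation is finished. One can clean them up by adding rm -rf /var/lib/apt/lists/* to the same RUN instruction that installs the packages. Deleting them in a later instruction would not help, because the files would still be stored in the earlier layer.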

FROM base_image
RUN apt-get update && apt-get install -y --no-install-recommends \
    package_one \
    package_two \
    && rm -rf /var/lib/apt/lists/*

Without cleaning up after installation the image size was 960 MB.

  • Change the Dockerfile
vim Dockerfile 
FROM python:3.9
RUN apt-get update && \
    apt-get install --no-install-recommends nginx php -y \
    && rm -rf /var/lib/apt/lists/*
  • To build the image
docker image build -t pratikshahp/python:learn .
  • To list the images
docker images
Figure 4: Image with clean-up after install

With clean-up after installation, the size is 942 MB (a further 18 MB reduction).

Use the Smallest Base Images

Base images fall into two main categories:

  • Images based on a full-scale OS like Ubuntu.
  • Images based on a minimal OS like Alpine Linux.

Smaller base images are faster to pull, which speeds up build operations. They are more secure too, because fewer installed packages mean a smaller attack surface.

FROM base_image:alpine

Alpine is not always a good choice as a base image, for the following reasons.

  • Compatibility: Alpine uses the musl C library instead of the much more common glibc. This often causes compatibility problems, because most Linux distributions use glibc and software built against glibc may not run on musl.
  • Dependencies: Alpine installs dependencies (packages) with apk add instead of apt-get install, and not all packages may be available for Alpine.

With a full-scale base image the size was 942 MB.

  • Modify Dockerfile
vim Dockerfile 
FROM python:3.9-slim
RUN apt-get update && \
    apt-get install --no-install-recommends nginx php -y \
    && rm -rf /var/lib/apt/lists/*
  • To build the image
docker image build -t pratikshahp/python:learn .
  • To list the images
docker images
Figure 5: Image with python-slim base image

With python:3.9-slim (a minimal Python base image) as the base image, the size is 258 MB (a further 684 MB reduction).

An even smaller base image such as Alpine can be used instead.

  • To Modify Dockerfile
vim Dockerfile 
FROM python:alpine
RUN apk update && \
    apk add nginx php
  • To build the image
docker image build -t pratikshahp/python:learn .
  • To list the images
docker images
Figure 6: Image with alpine as a base image

With alpine as the base image, the size is only 71.5 MB (a further 186.5 MB reduction, with a shorter build time).

Note: Alpine is worth using as a base image when low disk usage is the priority and the compatibility issues above do not apply.
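
The sizes measured so far can be tallied to see what each step contributed. The following is a small sketch that only does arithmetic on the figures reported in this lab; the variable names are illustrative, and sizes on another machine or with newer base images will differ.

```shell
#!/bin/sh
# Image sizes in MB as reported in this lab, in the order the techniques were applied:
# initial, fewer layers, --no-install-recommends, clean-up, python:3.9-slim, python:alpine
sizes="975 972 960 942 258 71.5"

# Print the saving of each step and the overall reduction relative to the first image.
report=$(echo "$sizes" | awk '{
    for (i = 2; i <= NF; i++)
        printf "step %d: %.1f MB -> %.1f MB (saved %.1f MB)\n", i - 1, $(i-1), $i, $(i-1) - $i
    printf "total: saved %.1f MB (%.1f%%)\n", $1 - $NF, ($1 - $NF) / $1 * 100
}')
echo "$report"
```

Most of the saving comes from the change of base image; the layer and package techniques contribute only a few megabytes each in this example.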

Use .dockerignore File

A Dockerfile is a text file that is used to build a customized image based on the instructions written in it. The .dockerignore file lists the files and directories that are excluded from the build context, so they cannot be copied into the image while building. The .dockerignore file must sit at the root of the build context, which in this lab is the folder where the Dockerfile resides.

Example .dockerignore file:

password.txt
secret.txt
logs
.git
*.md
.cache

How much the size decreases depends on the requirements of the application and on the size of the files listed in the .dockerignore file.
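
As a minimal sketch, the file can be created from the shell next to the Dockerfile. The BUILD_DIR variable is an assumption introduced here and defaults to the current directory; the patterns are the ones listed above, with comment lines added for readability.

```shell
#!/bin/sh
# Directory that holds the Dockerfile (assumed to be the build context root).
BUILD_DIR="${BUILD_DIR:-.}"

# Write the ignore patterns; lines starting with # are comments.
cat > "$BUILD_DIR/.dockerignore" <<'EOF'
# secrets that must never reach the image
password.txt
secret.txt
# files the application does not need at run time
logs
.git
*.md
.cache
EOF

# Count the active (non-comment) patterns.
pattern_count=$(grep -cv '^#' "$BUILD_DIR/.dockerignore")
echo "$pattern_count patterns written to $BUILD_DIR/.dockerignore"
```

After this, a build run from the same directory leaves the listed files out of the build context.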

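Alert: Listing the Dockerfile itself in the .dockerignore file does not stop the build, because Docker still sends the Dockerfile to the daemon. The entry only keeps COPY and ADD from copying the Dockerfile into the image.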

Multistage Builds

A multistage build splits a Dockerfile into several stages so that it stays easy to read, maintain and modify. Rather than keeping one Dockerfile for development and another for production, one can compile in a build stage and copy only the resulting artifacts into the final stage, which reduces the resulting image size. Developers do not ship the source code to the client; they package only the binary.

For example, to run a C program, build one image with a simple Dockerfile and another with a multistage Dockerfile, then compare the sizes of the images.

  • To create a C file
vim hello.c
#include <stdio.h>

int main()
{
    printf("Hello Everyone!!\nWelcome to Home Page\n");
    return 0;
}
  • To create a script file
vim myscript.sh
#!/bin/sh
export DBUS_SYSTEM_BUS_ADDRESS=unix:path=/host/run/dbus/system_bus_socket
# run our binary
/usr/src/app/hello
  • To write a Dockerfile
vim Dockerfile
FROM ubuntu AS buildstep
RUN apt-get update && apt-get install -y build-essential gcc
WORKDIR /usr/src/app
COPY ./hello.c hello.c
COPY ./myscript.sh /app/myscript.sh
RUN gcc -o hello hello.c && chmod +x hello
ENV INITSYSTEM=on
CMD ["bash", "/app/myscript.sh"]
  • To build an image from a simple Dockerfile
docker image build -t pratikshahp/simpleimg:capp . 
  • To run the container from the previous image
docker container run pratikshahp/simpleimg:capp
Figure 7: Running Container with simple image
  • To write Multistage Dockerfile
vim MultiStageDockerfile
FROM ubuntu AS buildstep
RUN apt-get update && apt-get install -y build-essential gcc
COPY hello.c /app/hello.c
WORKDIR /app
RUN gcc -static -o hello hello.c && chmod +x hello

FROM alpine
RUN mkdir -p /usr/src/app/
WORKDIR /usr/src/app
COPY --from=buildstep /app/hello ./hello
COPY ./myscript.sh ./myscript.sh
CMD ["/bin/sh", "/usr/src/app/myscript.sh"]
  • To build an image from a Multi Stage Dockerfile
docker build -t pratikshahp/multistagebuiltimg:capp -f MultiStageDockerfile .
  • To run the container from the previous multistage build image
docker container run pratikshahp/multistagebuiltimg:capp
Figure 8: Running container with Multistage build image

Both containers run from different images and execute the C program the same way!

  • To verify the size of both the simple image and the multistage build image
docker images
Figure 9: Verification of image size

With the simple image the size is 394 MB; with the multistage build the size is significantly reduced, to only 7.95 MB.

Store Application Data In Another Place

Storing application data in the image at build time increases its size. To avoid that, use the volume feature of the container runtime to keep application data separate from the image.
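
A minimal sketch of the idea with a named Docker volume follows. The volume name appdata and the mount path /var/lib/appdata are hypothetical, and the alpine image is used only as a small stand-in for a real application image. The Docker commands run only when a daemon is reachable.

```shell
#!/bin/sh
# Hypothetical names: a named volume and the path where the container expects its data.
VOLUME_NAME="appdata"
MOUNT_PATH="/var/lib/appdata"
IMAGE="${IMAGE:-alpine}"

# Assemble the command first so it can be reviewed before it runs.
run_cmd="docker container run --rm -v ${VOLUME_NAME}:${MOUNT_PATH} ${IMAGE} ls ${MOUNT_PATH}"
echo "$run_cmd"

# Only touch Docker when a daemon is reachable.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    # The data lives in the volume, not in an image layer.
    docker volume create "$VOLUME_NAME"
    $run_cmd
fi
```

Data written under the mount path survives the container and never adds to the size of the image.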

Take Advantage Of Caching

To use the cache efficiently, the instructions in the Dockerfile must be written in the right order. When an instruction in the middle changes, that instruction and all the instructions after it have to be rebuilt.

Figure 10: Caching used in image building
FROM ubuntu
WORKDIR /<dir_name>
# Copy all files in the current directory
COPY . .
# Install packages
RUN apt-get update && apt-get install -y package_one package_two
# Run build
RUN apt-build [options]

Place expensive instructions that rarely change near the start of the Dockerfile and instructions whose inputs change frequently near the end, to minimize rebuilds.

In the above scenario, a change in any file of the project forces all packages to be reinstalled on every new image build, even though the packages and the package management files did not change.

Now change the scenario and split the COPY instruction into two parts.

  • Copy the package management files and install the packages.
  • Copy the project files, which may change.
FROM ubuntu
WORKDIR /<dir_name>
# Copy package management files
COPY source.list dpkg ./
# Install packages
RUN apt-get update && apt-get install -y package_one package_two
# Copy project files
COPY . .
# Run build
RUN apt-build [options]

Note: Caching will not reduce the image size but it reduces the build time.
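
The same ordering can be sketched for a Python project. This example is an assumption for illustration and not part of the lab: the file names requirements.txt, app.py and Dockerfile.cached are hypothetical. The script only writes the Dockerfile and checks the order of the two COPY instructions.

```shell
#!/bin/sh
# Write a cache-friendly Dockerfile for a hypothetical Python project.
cat > Dockerfile.cached <<'EOF'
FROM python:3.9-slim
WORKDIR /myapp
# Dependency manifest first: this layer is rebuilt only when requirements.txt changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Project files last: editing source code invalidates only the layers from here on.
COPY . .
CMD ["python", "app.py"]
EOF

# Line numbers of the two COPY instructions; the manifest must come first.
manifest_line=$(grep -n '^COPY requirements.txt' Dockerfile.cached | cut -d: -f1)
source_line=$(grep -n '^COPY \. \.' Dockerfile.cached | cut -d: -f1)
echo "manifest copied at line $manifest_line, sources at line $source_line"
```

With this order, a change to the source code reuses the cached dependency layer, so pip install does not run again.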

  • Take a Dockerfile
vim Dockerfile
FROM python:3.9
RUN apt-get update && apt-get install nginx -y
RUN apt-get install php -y
WORKDIR /myapp/
RUN date
COPY . .
  • To build the image
docker image build -t pratikshahp/img:learn .
  • Now, swap the last two instructions of the Dockerfile
vim Dockerfile
FROM python:3.9
RUN apt-get update && apt-get install nginx -y
RUN apt-get install php -y
WORKDIR /myapp/
COPY . .
RUN date

Build the image again with the modified Dockerfile:

docker image build -t pratikshahp/imgcache:learn .
Figure 11: Use of Cache
  • Once more, modify the Dockerfile, this time at the 3rd instruction
vim Dockerfile
FROM python:3.9
RUN apt-get update && apt-get install nginx -y
RUN date
RUN apt-get install php -y
WORKDIR /myapp/
COPY . .
  • To build the image
docker image build -t pratikshahp/testimg:learn .
Figure 12: Verify correct sequence of layers

The cache is not used from the 3rd instruction onward, so the php package is installed again.

Do not Install Editors Using Dockerfile

Developers may install tools such as curl, vim, or nano through the Dockerfile, which results in a larger image. To keep the benefit of debugging inside the container, install such tools during the development phase and remove them from the Dockerfile once development is finished.

Use Docker --squash Flag At Build Time

Build the Docker image using the --squash flag to squash some Docker layers and create a resulting image with fewer layers.

  • First check docker version
docker version

In the Server section of the output, if Experimental is false, then --squash will not work. To enable it:

  • Create a /etc/docker/daemon.json file
vim /etc/docker/daemon.json
{
    "experimental": true
}
  • Restart docker daemon to reflect the change
systemctl restart docker 
  • To check experimental value
docker version -f '{{.Server.Experimental}}'

If it prints true, the --squash flag can be used.
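
The check can also be scripted so that a build only passes --squash when the daemon allows it. This is a minimal sketch: the helper name squash_supported is hypothetical, and when Docker is not installed the script simply treats experimental mode as off.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the value of {{.Server.Experimental}} is "true".
squash_supported() {
    [ "$1" = "true" ]
}

# Read the experimental setting from the daemon, or fall back to "false".
if command -v docker >/dev/null 2>&1; then
    experimental=$(docker version -f '{{.Server.Experimental}}' 2>/dev/null)
else
    experimental="false"
fi

if squash_supported "$experimental"; then
    echo "experimental mode is on: --squash can be used"
else
    echo "experimental mode is off: build without --squash or enable it in daemon.json"
fi
```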

If the Dockerfile is in the current directory:

docker image build --squash -t <imagename> .

Using the same Dockerfile, create two different images:

vim Dockerfile
FROM python:3.9
RUN apt-get update
RUN apt-get install nginx -y
RUN apt-get install php -y
  • To build an image without --squash flag
docker image build -t pratikshahp/withoutsquash:learn .
  • To build an image with --squash flag
docker image build --squash -t pratikshahp/squash:learn .
  • To check the difference in both images
docker images
Figure 13: Verify --squash flag

The image size is 974 MB without the --squash flag and 971 MB with the --squash flag.

Conclusion

Optimizing images while keeping them secure and without losing quality has long been a matter of concern. There is no single formula that optimizes all images in the same way; minimizing an image depends on the particular application's requirements and use case. This hands-on lab presents different ways to optimize images. I would love to hear comments from readers so it can be improved.
