Optimizing Docker Image Size For Real

I’ve come across tips on how to keep Docker images small and Dockerfiles with strange lines that seem to exist only to optimize image size. Well, it turns out they’re all wrong.

They may have an effect on flat Docker images, but for everything else (i.e. 99% of what people do), cleanup steps are just extra steps. When Docker builds an image from a Dockerfile, every step is a checkpoint, and every checkpoint is saved. If you add 100 MB in one step and then delete it in the next, that 100 MB still needs to be saved so other Dockerfiles with the same step can reuse it.
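To make the checkpoint behaviour concrete, here is a minimal sketch. The file name `/tmp/blob` and the use of `dd` are illustrative assumptions, not something taken from the tests below.

```dockerfile
FROM ubuntu

# Step 1: writes a 100 MB file. This step is saved as its own checkpoint.
RUN dd if=/dev/zero of=/tmp/blob bs=1M count=100

# Step 2: removes the file from the final filesystem, but the checkpoint
# from step 1 (including the 100 MB) is still stored with the image.
RUN rm /tmp/blob
```

Building this and running `docker images` should show an image roughly 100 MB larger than the base, even though `/tmp/blob` no longer exists inside it.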

Results

REPOSITORY               TAG             IMAGE ID            CREATED             VIRTUAL SIZE
test/baseline            latest          7b590dec9b43        7 hours ago         272.6 MB
test/baseline_lines      latest          e165025980f7        9 minutes ago       272.6 MB
test/baseline_lists      latest          b40f9e108a93        About an hour ago   272.6 MB
test/combo               latest          744b502e0052        2 seconds ago       269.8 MB
test/combo2              latest          be8f1c1de02e        About an hour ago   249.8 MB
test/combo3              latest          da948e2838d9        About an hour ago   249.8 MB
test/install             latest          e7cadcbb5a05        12 hours ago        269.8 MB
test/install_clean       latest          dd1383285e85        12 hours ago        269.8 MB
test/install_lists       latest          e55f6f8ebac8        12 hours ago        269.8 MB
test/purge               latest          ef8c2aa7400b        About an hour ago   273.5 MB
test/remove              latest          75e3e5c4e246        About an hour ago   273.5 MB

Hypothesis: Docker’s base Ubuntu image does not need `apt-get clean`

I ran an experiment around Docker 0.6, and I think my conclusion back then was that `apt-get install … && apt-get clean` saved a few megabytes. But I heard that you no longer need to do that. If you compare the sizes of “test/install” and “test/install_clean”, you’ll see there is no difference: both are 269.8 MB. So you don’t need `apt-get clean`.

Hypothesis: `rm -rf /var/lib/apt/lists/*` saves some space

I’ve been seeing a lot of Dockerfiles lately with this line, including lots of official Docker images. If those guys are all doing it, surely it must have some effect. Nope: “test/install_lists” is 269.8 MB, exactly the same as “test/install”.
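As a rough sketch of the pattern in question (the exact Dockerfiles behind the numbers above are not reproduced here, and `curl` is just a placeholder package), the cleanup typically shows up as a step after the package lists have already been saved by an earlier one:

```dockerfile
FROM ubuntu

# The package lists are downloaded here and saved as part of this checkpoint.
RUN apt-get update

RUN apt-get install -y curl

# Deletes the lists from the final filesystem only. The checkpoint created
# by `apt-get update` above still contains them.
RUN rm -rf /var/lib/apt/lists/*
```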

Hypothesis: Combining similar lines saves space

There’s some overhead for each line in a Dockerfile. How significant is it? Well, it turns out it’s not. What I did find, though, is that combining lines saves a significant amount of build time and a lot of disk thrashing. So combining lines does not save space, but it does save time.
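For illustration, "combining similar lines" might mean going from the first Dockerfile to the second. The package names are placeholders and not the ones used in the tests.

```dockerfile
# Dockerfile A: one step (and one checkpoint) per package
FROM ubuntu
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y git
RUN apt-get install -y vim
```

```dockerfile
# Dockerfile B: the similar lines merged into a single step
FROM ubuntu
RUN apt-get update
RUN apt-get install -y curl git vim
```

Both end up with the same files installed, so the sizes should come out about equal. The second simply has fewer checkpoints to write during the build.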

Hypothesis: Combining multiple steps saves space

This makes sense. If you skip making checkpoints, you’re not storing intermediate states. And it turns out this is the only way to make a Docker image built from a Dockerfile smaller: “test/combo2” and “test/combo3” come in at 249.8 MB, versus 269.8 MB for “test/install”. But this comes at the cost of readability and, more importantly, at the cost of fewer shared steps that other images can reuse.
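A minimal sketch of what combining multiple steps can look like follows. This is an assumed shape, not necessarily what the combo images contained, and `curl` is a placeholder package.

```dockerfile
FROM ubuntu

# One step, one checkpoint: the package lists are fetched, used, and removed
# before anything is saved, so the intermediate state never lands in the image.
RUN apt-get update \
 && apt-get install -y curl \
 && rm -rf /var/lib/apt/lists/*
```

The trade-off is the one described above: a Dockerfile that installs a different package can no longer reuse a shared `apt-get update` checkpoint, because that step no longer exists on its own.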

Hypothesis: `apt-get purge` saves some space

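Well, this hypothesis seems silly now, but I see it used now and then. Deletions do not save space: “test/purge” and “test/remove” are both 273.5 MB, slightly larger than the 272.6 MB of “test/baseline”.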
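The pattern usually looks something like this sketch, where `build-essential` is only a stand-in for whatever package gets installed and later removed:

```dockerfile
FROM ubuntu
RUN apt-get update
RUN apt-get install -y build-essential

# Removes the packages from the final filesystem, but the install checkpoint
# above is still stored, and this step adds a small checkpoint of its own.
RUN apt-get purge -y build-essential
```

That extra checkpoint would be consistent with the purge and remove images in the results being a little bigger than the baseline rather than smaller.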

Conclusion

Write your Dockerfiles the same way you run commands. Don’t prematurely optimize by adding extra cruft you saw someone else use. If you’re actually worried about image size, use some sort of automation to rebuild Docker images behind the scenes. Just keep that logic out of the Dockerfile. And always keep measuring. Know your bottlenecks.
