Image Optimization
Optimizing Docker images improves build times, reduces storage costs, and speeds up deployments. This lesson covers techniques to create lean, efficient images.
Why Optimize?
| Metric | Impact of Large Images |
|---|---|
| Build time | Slower CI/CD pipelines |
| Pull time | Slower container starts |
| Storage | Higher registry costs |
| Network | More bandwidth usage |
| Security | More potential vulnerabilities |
Base Image Selection
Size Comparison
| Base image | Approximate size |
|---|---|
| ubuntu:22.04 | 77 MB |
| debian:bookworm | 139 MB |
| python:3.11 | 1 GB |
| python:3.11-slim | 125 MB |
| python:3.11-alpine | 52 MB |
| node:18 | 1.1 GB |
| node:18-slim | 241 MB |
| node:18-alpine | 175 MB |
| alpine:3.19 | 5 MB |
| scratch | 0 MB |
| distroless/base | 20 MB |
Choosing the Right Base
# For general applications - slim variants
FROM python:3.11-slim
FROM node:18-slim
# For minimal size - Alpine
FROM python:3.11-alpine
FROM node:18-alpine
# For static binaries - scratch
FROM scratch
COPY myapp /myapp
CMD ["/myapp"]
# For security - distroless
FROM gcr.io/distroless/nodejs18-debian11
Alpine Considerations
# Alpine uses musl libc instead of glibc
# Python packages without musl-compatible wheels are compiled from source
FROM python:3.11-alpine
# Install build dependencies, build, then remove them in the same layer
RUN apk add --no-cache --virtual .build-deps \
        gcc \
        musl-dev \
    && pip install --no-cache-dir package \
    && apk del .build-deps
Multi-Stage Builds
Separate build dependencies from runtime:
# Build stage - includes all build tools
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
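# Build, then drop devDependencies so only production deps are copied below
RUN npm run build && npm prune --omit=dev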
# Production stage - minimal
FROM node:18-alpine AS production
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
USER node
CMD ["node", "dist/index.js"]
Size Impact
Single stage: 1.1 GB (node:18 + all deps)
Multi-stage: 180 MB (node:18-alpine + production deps only)
Reduction: 84%
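The reduction figure is simple arithmetic: (before - after) / before. A minimal sketch follows; `reduction_pct` is a hypothetical helper name, and 1.1 GB is taken as 1100 MB, which is an assumption about the rounding used above.

```shell
# reduction_pct BEFORE_MB AFTER_MB
# Prints the size reduction as a whole-number percentage.
reduction_pct() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.0f\n", (before - after) / before * 100 }'
}

# Figures from the comparison above, taking 1.1 GB as 1100 MB (an assumption).
reduction_pct 1100 180   # prints 84
```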
Layer Optimization
Combine Commands
# Bad - 4 layers, intermediate files persist
RUN apt-get update
RUN apt-get install -y curl wget
RUN apt-get clean
RUN rm -rf /var/lib/apt/lists/*
# Good - 1 layer, cleanup included
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        curl \
        wget \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
Remove Unnecessary Files
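# Python: skip pip's download cache
RUN pip install --no-cache-dir -r requirements.txt
# Node.js: skip devDependencies and clear the npm cache
RUN npm ci --omit=dev && npm cache clean --force
# Debian/Ubuntu: refresh, install, and clean up in the same layer
RUN apt-get update \
    && apt-get install -y --no-install-recommends package \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*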
.dockerignore
Exclude files from build context:
# .dockerignore
.git
.gitignore
node_modules
npm-debug.log
Dockerfile*
docker-compose*
.dockerignore
.env*
# Patterns without a leading **/ match only at the root of the build context
*.md
**/*.log
**/.DS_Store
coverage
.nyc_output
test
tests
**/__tests__
**/*.test.js
**/*.spec.js
docs
.vscode
.idea
Impact
# Without .dockerignore
Sending build context to Docker daemon 500MB
# With .dockerignore
Sending build context to Docker daemon 2.5MB
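A build context can grow back if someone removes an entry from .dockerignore. One way to guard against that in CI is a small lint step, sketched below. `missing_ignores` is a hypothetical helper, and it only checks for exact lines; it does not evaluate Docker's pattern-matching rules.

```shell
# missing_ignores FILE PATTERN...
# Prints each PATTERN that has no exactly matching line in FILE.
missing_ignores() {
  file=$1
  shift
  for pattern in "$@"; do
    # -x: whole-line match, -F: treat the pattern as a literal string
    grep -qxF -- "$pattern" "$file" 2>/dev/null || printf '%s\n' "$pattern"
  done
}

# Example use in a CI step (fails when anything is missing):
#   missing=$(missing_ignores .dockerignore .git node_modules '.env*')
#   [ -z "$missing" ] || { echo "Missing from .dockerignore: $missing" >&2; exit 1; }
```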
Caching Strategies
Layer Order
# Optimal order (least to most frequently changing)
FROM node:18-alpine
# 1. System configuration (rarely changes)
WORKDIR /app
# 2. Dependencies (changes when deps change)
COPY package*.json ./
RUN npm ci --omit=dev
# 3. Application code (changes frequently)
COPY . .
# 4. Metadata (adds no files to the image)
EXPOSE 3000
CMD ["node", "app.js"]
External Cache
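# Reuse layers from a previously pushed image as cache
docker build --cache-from myregistry/myapp:latest -t myapp:new .
# Embed cache metadata in the image (BuildKit) so it can serve as a --cache-from source later
docker build \
    --build-arg BUILDKIT_INLINE_CACHE=1 \
    -t myapp .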
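In a CI job the two commands above are usually combined into one cycle: build against the last pushed image, then push the result so the next job can reuse it. The sketch below shows one possible sequence. `build_with_cache` is a hypothetical helper, and using `latest` as the cache tag and BuildKit as the active builder are assumptions.

```shell
# build_with_cache REPO TAG
# Reuses the last pushed image as cache and embeds cache metadata in the
# new image so the next build can do the same.
# Assumes BuildKit is the active builder and REPO:latest is the cache tag.
build_with_cache() {
  repo=$1
  tag=$2
  # The first build has nothing to pull, so a failed pull is not an error.
  # The explicit pull mainly matters for the legacy builder.
  docker pull "$repo:latest" || true
  docker build \
      --cache-from "$repo:latest" \
      --build-arg BUILDKIT_INLINE_CACHE=1 \
      -t "$repo:$tag" \
      -t "$repo:latest" \
      . \
    && docker push "$repo:$tag" \
    && docker push "$repo:latest"
}

# Example: build_with_cache myregistry/myapp "$CI_COMMIT_SHA"
```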
Analyzing Image Size
Docker Commands
# Image size
docker images myapp
# Layer sizes
docker history myapp
# Detailed history
docker history --no-trunc myapp
# Disk usage
docker system df -v
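`docker history` prints human-readable sizes, which makes layers hard to rank at a glance. The sketch below converts those sizes to bytes and lists the largest layers first. `to_bytes` and `largest_layers` are hypothetical helpers, and the conversion assumes the decimal units (B, kB, MB, GB) that the Docker CLI prints.

```shell
# to_bytes SIZE
# Converts a size such as 0B, 1.2kB, 45MB or 1.1GB to bytes (decimal units).
to_bytes() {
  printf '%s\n' "$1" | awk '{
    n = $0; sub(/[A-Za-z]+$/, "", n)   # numeric part
    u = $0; sub(/^[0-9.]+/, "", u)     # unit suffix
    m = 1
    if (u == "kB" || u == "KB") m = 1000
    else if (u == "MB") m = 1000 * 1000
    else if (u == "GB") m = 1000 * 1000 * 1000
    printf "%.0f\n", n * m
  }'
}

# largest_layers N
# Reads "SIZE<TAB>CREATED_BY" lines on stdin and prints the N largest
# as "BYTES<TAB>CREATED_BY", biggest first (N defaults to 5).
largest_layers() {
  tab=$(printf '\t')
  while IFS="$tab" read -r size created_by; do
    printf '%s\t%s\n' "$(to_bytes "$size")" "$created_by"
  done | sort -t "$tab" -k1,1nr | head -n "${1:-5}"
}

# Example:
#   docker history --format '{{.Size}}\t{{.CreatedBy}}' myapp | largest_layers 3
```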
Using dive
# Install dive
brew install dive # macOS
# Analyze image
dive myapp:latest
Using Docker Scout
# Quick summary
docker scout quickview myapp
# Detailed analysis
docker scout cves myapp
docker scout recommendations myapp
Language-Specific Optimization
Node.js
FROM node:18-alpine
WORKDIR /app
# Only production deps
COPY package*.json ./
RUN npm ci --omit=dev && npm cache clean --force
# Don't copy source if built separately
COPY dist/ ./dist/
USER node
CMD ["node", "dist/index.js"]
Python
FROM python:3.11-slim
WORKDIR /app
# Install deps without cache
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy only needed files
COPY app/ ./app/
COPY main.py .
USER nobody
CMD ["python", "main.py"]
Go
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY go.* ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o main .
# Smallest possible - just the binary
FROM scratch
COPY --from=builder /app/main /
EXPOSE 8080
ENTRYPOINT ["/main"]
Compression
Squash Layers (Experimental)
# Experimental: needs the legacy builder with experimental daemon features enabled
docker build --squash -t myapp .
Compress at Build
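# Pre-compress static assets (gzip -r replaces each file with a .gz copy,
# so the web server must be configured to serve pre-compressed files)
RUN gzip -r /app/public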
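Because plain `gzip -r` removes the original files, a server that is not set up for pre-compressed assets would have nothing to serve. A hedged alternative is to keep the originals and compress only text assets, as sketched below. `precompress` is a hypothetical helper, the list of extensions is an assumption, and `-k` requires a gzip that supports keeping the input file.

```shell
# precompress DIR
# Writes a .gz copy next to each HTML, CSS and JS file under DIR,
# keeping the original (-k) and overwriting stale .gz files (-f).
precompress() {
  find "$1" -type f \( -name '*.html' -o -name '*.css' -o -name '*.js' \) \
    -exec gzip -k -f -9 {} +
}

# Example: precompress /app/public
```

Keeping both copies increases image size slightly, so this trades a little space for compatibility with servers that do not serve .gz files.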
Monitoring Size Over Time
# Compare image sizes
docker images --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}"
# Track in CI/CD
docker images "myapp:$CI_COMMIT_SHA" --format "{{.Size}}"
Size Targets
| Application Type | Target Size |
|---|---|
| Static binary (Go, Rust) | < 20 MB |
| Node.js API | < 200 MB |
| Python API | < 200 MB |
| Full-stack app | < 500 MB |
| ML application | < 2 GB |
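Targets like these are easier to keep when CI enforces them. The sketch below fails a job when an image exceeds its budget. `within_budget` and `check_image` are hypothetical helpers, and limits are read as decimal megabytes, which is an assumption about how the table's units are meant.

```shell
# within_budget SIZE_BYTES LIMIT_MB
# Succeeds when SIZE_BYTES is at most LIMIT_MB (decimal megabytes).
within_budget() {
  [ "$1" -le $(( $2 * 1000 * 1000 )) ]
}

# check_image IMAGE LIMIT_MB
# Looks up the image size in bytes and returns non-zero if it is over budget.
check_image() {
  size=$(docker image inspect --format '{{.Size}}' "$1") || return 2
  if within_budget "$size" "$2"; then
    echo "OK: $1 is $size bytes (limit $2 MB)"
  else
    echo "FAIL: $1 is $size bytes (limit $2 MB)" >&2
    return 1
  fi
}

# Example: check_image "myapp:$CI_COMMIT_SHA" 200
```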
Optimization Checklist
- Using appropriate base image (slim/alpine)
- Multi-stage build implemented
- .dockerignore configured
- Commands combined to reduce layers
- Cache cleaned after package installation
- Production dependencies only
- Unused files removed
- Build context minimized
Key Takeaways
- Use slim or Alpine base images
- Multi-stage builds dramatically reduce size
- Order layers by change frequency
- Always use .dockerignore
- Clean package manager caches
- Install only production dependencies
- Analyze images with dive or Docker Scout
- Set size targets and monitor over time

