Week 3 | Lesson 13

Development, Deployment & Monitoring

Docker, Containerization, Deployment, Logging, Metrics, Observability, Resilient Applications



© 2026 by Monika Protivová

Docker Basics

What is Docker?

Docker is a platform for developing, shipping, and running applications in containers

Docker enables you to package your application with all its dependencies into a standardized unit called a container. Containers are lightweight, standalone, and executable packages that include everything needed to run your software.

Key Concepts

  • Container - A running instance of a Docker image, isolated from the host system
  • Image - A read-only template containing application code, runtime, libraries, and dependencies
  • Dockerfile - A text file with instructions for building a Docker image
  • Registry - A storage and distribution system for Docker images (e.g., Docker Hub)

💡 Think of a Docker image as a recipe (Dockerfile), and a container as the meal you cook from that recipe.

Docker Architecture

Understanding Docker's core components and how they work together

Docker Components

  • Docker Engine - The runtime that builds and runs containers
  • Docker Client - The CLI tool (docker command) you use to interact with Docker
  • Docker Daemon - Background service that manages containers
  • Docker Hub - Public registry for sharing Docker images

How Docker Works

  1. You write a Dockerfile describing your application
  2. Docker builds an image from the Dockerfile
  3. You run a container from the image
  4. The container runs your application in isolation

Containers vs Virtual Machines

  • Containers share the host OS kernel (lightweight)
  • VMs include a full OS (heavy)
  • Containers start in seconds, VMs take minutes
  • Containers use less disk space and memory

Why Use Docker?

Docker solves the 'it works on my machine' problem and more

Benefits

  • Consistency - Same environment from development to production
  • Isolation - Applications run independently without conflicts
  • Portability - Run anywhere Docker is installed
  • Efficiency - Lightweight, fast startup, efficient resource usage
  • Scalability - Easy to replicate and scale applications
  • Version Control - Images can be versioned and rolled back

Use Cases

  • Microservices architecture
  • CI/CD pipelines
  • Development environments
  • Application deployment
  • Testing in isolated environments
  • Cloud-native applications

Essential Docker Commands

Common Docker CLI commands for working with images and containers

Image Commands

# Build an image from Dockerfile
docker build -t myapp:1.0 .

# List images
docker images

# Pull image from registry
docker pull postgres:15

# Remove an image
docker rmi myapp:1.0

Container Commands

# Run a container
docker run -d -p 8080:8080 --name myapp myapp:1.0

# List running containers
docker ps

# List all containers (including stopped)
docker ps -a

# Stop a container
docker stop myapp

# Start a stopped container
docker start myapp

# Remove a container
docker rm myapp

# View container logs
docker logs myapp

# Execute command in running container
docker exec -it myapp /bin/bash

Dockerfile

What is a Dockerfile?

A Dockerfile is a text document containing instructions for building a Docker image

A Dockerfile defines the steps Docker takes to create an image. Each instruction creates a layer in the image, and Docker caches these layers for efficiency.

Basic Dockerfile Structure

# Start from a base image
FROM openjdk:17-jdk-slim

# Set the working directory
WORKDIR /app

# Copy files into the image
COPY target/myapp.jar app.jar

# Expose a port
EXPOSE 8080

# Define the command to run
ENTRYPOINT ["java", "-jar", "app.jar"]

Key Principles

  • Instructions are executed in order
  • Each instruction creates a new layer
  • Layers are cached for faster rebuilds
  • Order matters for cache optimization
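Layer caching rewards careful ordering: copy rarely-changing build files before the frequently-changing source. A sketch for a Gradle project (the image tag and paths are illustrative and assume the standard layout):

```dockerfile
FROM gradle:8-jdk17
WORKDIR /app

# Dependency definitions change rarely, so this layer is usually cached
COPY build.gradle.kts settings.gradle.kts ./
RUN gradle dependencies --no-daemon

# Source changes invalidate only the layers from here down
COPY src ./src
RUN gradle build --no-daemon
```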

Common Dockerfile Instructions

  • FROM - Sets the base image (must be the first instruction)
    FROM openjdk:17-jdk-slim
  • WORKDIR - Sets the working directory for subsequent instructions
    WORKDIR /app
  • COPY - Copies files from host to image
    COPY target/*.jar app.jar
  • RUN - Executes commands during image build
    RUN apt-get update && apt-get install -y curl
  • EXPOSE - Documents which port the container listens on
    EXPOSE 8080
  • ENV - Sets environment variables
    ENV SPRING_PROFILES_ACTIVE=prod
  • ENTRYPOINT - Defines the executable to run
    ENTRYPOINT ["java", "-jar", "app.jar"]
  • CMD - Provides default arguments for the ENTRYPOINT that can be overridden at runtime
    CMD ["--spring.profiles.active=dev"]
  • ADD - Similar to COPY but with extra features (e.g., extracting local tar files, fetching from URLs)
    ADD https://example.com/file.tar.gz /app/
  • VOLUME - Creates a mount point for external storage
    VOLUME /data

Dockerfile for Spring Boot

Creating an optimized Dockerfile for Spring Boot applications

Simple Spring Boot Dockerfile

FROM openjdk:17-jdk-slim

WORKDIR /app

# Copy the JAR file
COPY build/libs/*.jar app.jar

# Expose the application port
EXPOSE 8080

# Set JVM options
ENV JAVA_OPTS="-Xmx512m -Xms256m"

# Run the application
ENTRYPOINT ["sh", "-c", "java $JAVA_OPTS -jar app.jar"]

Building and Running

# Build the application first
./gradlew build

# Build Docker image
docker build -t myapp:1.0 .

# Run the container
docker run -p 8080:8080 myapp:1.0

Multi-stage Builds

Use multi-stage builds to create smaller, more secure images

Multi-stage builds allow you to build your application in one stage and copy only the necessary artifacts to the final image, resulting in smaller images.

# Build stage
FROM gradle:8-jdk17 AS build
WORKDIR /app
COPY . .
RUN gradle build --no-daemon

# Runtime stage
FROM openjdk:17-jdk-slim
WORKDIR /app

# Copy only the built JAR from build stage
COPY --from=build /app/build/libs/*.jar app.jar

EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]

Benefits

  • ✓ Smaller final image (no build tools)
  • ✓ More secure (fewer attack surfaces)
  • ✓ Faster deployments
  • ✓ No need to build locally

📝 Multi-stage builds are the recommended approach for production applications.

Docker Compose

What is Docker Compose?

Docker Compose is a tool for defining and running multi-container Docker applications

Docker Compose uses a YAML file to configure your application's services, networks, and volumes. With a single command, you can create and start all services from your configuration.

Why Docker Compose?

  • Multi-container Management - Run multiple containers as a single application
  • Simple Configuration - Define everything in one docker-compose.yml file
  • Networking - Automatic network creation for service communication
  • Development Workflow - Perfect for local development environments

📝 Common use case: Running your Spring Boot application with PostgreSQL, Redis, and other services locally.

docker-compose.yml Structure

Understanding the Docker Compose file format

Basic docker-compose.yml

version: '3.8'

services:
  app:
    build: .
    ports:
      - "8080:8080"
    environment:
      - SPRING_PROFILES_ACTIVE=dev
    depends_on:
      - db

  db:
    image: postgres:15
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_DB=myapp
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:

Key Sections

  • services - Defines containers to run
  • volumes - Persistent storage
  • networks - Custom networks (optional)

Spring Boot with Database

Complete example of Spring Boot application with PostgreSQL
version: '3.8'

services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: myapp
    ports:
      - "8080:8080"
    environment:
      - SPRING_DATASOURCE_URL=jdbc:postgresql://db:5432/myapp
      - SPRING_DATASOURCE_USERNAME=user
      - SPRING_DATASOURCE_PASSWORD=password
      - SPRING_JPA_HIBERNATE_DDL_AUTO=update
    depends_on:
      db:
        condition: service_healthy
    networks:
      - app-network

  db:
    image: postgres:15-alpine
    container_name: myapp-db
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_DB=myapp
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres_data:/var/lib/postgresql/data
    networks:
      - app-network
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  postgres_data:

networks:
  app-network:
    driver: bridge

Docker Compose Commands

Essential commands for working with Docker Compose
  • docker-compose up
    Builds, (re)creates, starts, and attaches to containers for a service
  • docker-compose up -d
    Starts services in detached mode (in the background)
  • docker-compose build
    Builds or rebuilds services
  • docker-compose stop
    Stops running containers without removing them
  • docker-compose down
    Stops and removes the containers and networks created by up (images and volumes are kept)
  • docker-compose down -v
    Also removes the named volumes declared in the compose file
  • docker-compose restart
    Restarts services
  • docker-compose ps
    Lists containers
  • docker-compose logs
    Views output from containers
  • docker-compose logs -f app
    Follows logs for the 'app' service
  • docker-compose exec app bash
    Executes a command in a running 'app' service container
  • docker-compose up -d --scale app=3
    Scales the 'app' service to 3 instances

Development & Environments

Environment Progression

Applications move through multiple environments from development to production, each serving a specific purpose in the delivery pipeline.
Local (developer's machine) → Dev (integration testing) → Staging (pre-production) → Production (live users)
  • Local (Development) - Individual developer's machine. Fast feedback loop, debugging enabled, mock services and data. Rapid iteration and experimentation.
  • Development (Dev) - Shared integration environment. Latest code from all developers, real database instances (dev), integration with other services. Testing feature integration.
  • Staging (QA/Pre-Production) - Production-like environment. Final testing before release, performance and load testing, QA validation. Should mirror production configuration.
  • Production - Live environment serving real users. Highest security and stability requirements, comprehensive monitoring and alerting, automated backups and disaster recovery.

Environment Configuration & Best Practices

Each environment requires different configuration and careful management to ensure safety and consistency.

Environment-Specific Configuration

Each environment has different:

  • Database connections
    Separate databases or schemas for each environment
  • API endpoints
    Different URLs for external services (payment, email, etc.)
  • Feature flags
    Enable/disable features per environment
  • Logging levels
    DEBUG in dev, INFO/WARN in production
  • Resource limits
    Memory, CPU, connection pools scaled per environment
  • Security settings
    SSL/TLS, CORS policies, authentication requirements
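In code, such settings are usually read from environment variables with a developer-friendly fallback. A minimal Kotlin sketch (the variable name DATABASE_URL and the fallback URL are illustrative):

```kotlin
// Read an environment-specific setting, falling back to a local default.
fun databaseUrl(env: Map<String, String> = System.getenv()): String =
    env["DATABASE_URL"] ?: "jdbc:postgresql://localhost:5432/myapp"
```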

Best Practices

  • Environment parity
    Keep environments as similar as possible. "Works on my machine" should mean "works in production"
  • Configuration as code
    Store environment configs in version control (not secrets!)
  • Secrets management
    Never commit secrets. Use environment variables, vault services (HashiCorp Vault, AWS Secrets Manager)
  • Infrastructure as code
    Define infrastructure with tools like Terraform, CloudFormation
  • Immutable infrastructure
    Deploy new versions, don't modify running systems
⚠️ Never use production data in development or staging environments. Use anonymized data or synthetic test data to protect user privacy and comply with regulations (GDPR, CCPA).

Development Tools & Practices

Modern Kotlin development relies on a robust toolchain for productivity and code quality.

Core Development Tools

  • IntelliJ IDEA
    Primary IDE with excellent Kotlin support, debugging, refactoring
  • Gradle
    Build automation tool (Kotlin DSL preferred over Groovy)
  • Git
    Version control with feature branches and pull requests
  • Docker
    Local containerization matching production environment

Testing Tools

  • Kotest or JUnit 5 - Testing frameworks
  • MockK - Mocking library
  • Testcontainers - Integration testing with real databases

Code Quality Tools

  • ktlint
    Code formatting and style checking
  • detekt
    Static code analysis for Kotlin
  • SonarQube
    Continuous code quality inspection

Development Practices

  • Write tests alongside code (TDD)
  • Use hot reload for faster development
  • Run linters pre-commit
  • Keep dependencies up to date
  • Use environment-specific profiles
💡 Enable Spring Boot DevTools for automatic restart and live reload during development. Add dependency: org.springframework.boot:spring-boot-devtools

Build & Packaging

Build Process & Gradle

Gradle orchestrates the entire build lifecycle from compilation to testing to packaging.
  • clean - Remove build artifacts (./gradlew clean)
  • compileKotlin - Compile sources to classes (./gradlew compileKotlin)
  • test - Run tests (./gradlew test)
  • check - Run quality checks (./gradlew check)
  • build - Full build (./gradlew build)
  • bootJar - Create the executable JAR (./gradlew bootJar)
  • bootRun - Run the application (./gradlew bootRun)

Test failures stop the build.

Key files in Gradle projects:

  • build.gradle.kts - Build configuration (dependencies, plugins, tasks)
  • settings.gradle.kts - Project structure and modules
  • gradle.properties - Build properties and JVM options
  • gradlew - Gradle wrapper (ensures consistent version)
❗ Always use the Gradle wrapper (./gradlew) instead of a globally installed Gradle. This ensures everyone uses the same Gradle version.

Packaging Spring Boot Applications

Spring Boot applications are usually packaged as executable Fat JAR files that bundle everything needed to run the application.

JAR stands for Java ARchive: a package format for Java/Kotlin applications that is essentially a ZIP file with a specific structure and metadata.

Executable JAR

Also called a "Fat JAR" or "Uber JAR", it contains:

  • Application code
    Your compiled Kotlin/Java classes
  • All dependencies
    Every library bundled inside (Spring Boot, Kotlin stdlib, Jackson, etc.)
  • Embedded web server
    Tomcat, Jetty, or Undertow - no external server needed
  • Configuration & resources
    application.yml, static files, templates
  • It is self-contained and portable (ideal for containerization)
  • You can run it with: java -jar app.jar

Inside myapp.jar (executable fat JAR): application code (your Kotlin classes, controllers, services), dependency JARs (spring-boot, kotlin-stdlib, jackson, hibernate, and all others), an embedded web server (Tomcat/Jetty/Undertow), and configuration & resources (application.yml, static/*, templates/*).

Application Configuration

Spring Boot supports multiple configuration formats and environment-specific profiles for flexible deployment.

Configuration Files

application.properties (traditional)

server.port=8080
spring.datasource.url=jdbc:postgresql://localhost/mydb
spring.datasource.username=user
logging.level.root=INFO

application.yml (preferred - more readable)

server:
  port: 8080
spring:
  datasource:
    url: jdbc:postgresql://localhost/mydb
    username: user
logging:
  level:
    root: INFO

Spring Profiles

Environment-specific configuration:

  • application-dev.yml - Development
  • application-staging.yml - Staging
  • application-prod.yml - Production

Activate with:

# Environment variable
SPRING_PROFILES_ACTIVE=prod

# Command line
java -jar app.jar --spring.profiles.active=prod

# In application.yml
spring:
  profiles:
    active: dev

⚠️ Never commit sensitive data: Use environment variables or secret management tools (Vault, AWS Secrets Manager) for passwords, API keys, and tokens.

Configuration Priority & Overrides

Spring Boot uses a well-defined priority hierarchy to resolve configuration values. Higher priority sources override lower priority ones.
From highest to lowest priority:

  1. Command Line Arguments (--server.port=8080, --spring.profiles.active=prod)
  2. OS Environment Variables (SPRING_DATASOURCE_URL, DATABASE_PASSWORD)
  3. Profile-Specific Files (application-prod.yml, application-dev.yml)
  4. application.yml (base configuration file)
  5. @PropertySource Annotations (custom property files in code)
  6. Default Values in Code (hardcoded defaults in @Value annotations)

Common Use Cases

  • Temporary overrides
    Quick testing without changing files
  • Production secrets
    Store in environment variables or secret managers (AWS Secrets, Kubernetes secrets)
  • Environment-specific settings
    Database URLs, API endpoints, feature flags in application-prod.yml
  • Developer defaults
    Sensible defaults in application.yml that work for local development
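The lookup order can be pictured as a chain of fallbacks. This hypothetical resolver only illustrates the precedence; it is not Spring's actual implementation:

```kotlin
// CLI args beat environment variables, which beat profile-specific files,
// which beat application.yml, which beats the hardcoded default.
fun resolve(
    key: String,
    cli: Map<String, String>,
    env: Map<String, String>,
    profileFile: Map<String, String>,
    baseFile: Map<String, String>,
    default: String? = null,
): String? = cli[key] ?: env[key] ?: profileFile[key] ?: baseFile[key] ?: default
```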

❗ Configuration Best Practice: Start with good defaults in application.yml. Use profile files for environment differences. Store secrets in environment variables or secret management tools. Use command line arguments only for temporary overrides during testing.

Build Automation & CI/CD

Continuous Integration and Continuous Deployment automate building, testing, and deploying applications, catching issues early and enabling rapid, reliable releases.

Problems CI/CD Solves

  • "Works on my machine" - Everyone uses the same build environment
  • Forgotten tests - Tests run automatically on every commit
  • Integration issues - Code is integrated and tested continuously
  • Manual deployment errors - Deployments are automated and consistent
  • Slow feedback - Developers get immediate feedback on code quality

Key Benefits

  • Catch bugs early before they reach production
  • Ship features faster with confidence
  • Reduce manual work and human error
  • Enforce quality standards automatically
  • Enable multiple deployments per day
  • Improve team collaboration
Source Control (git push) → Build (compile & resolve) → Test (unit & integration) → Package (create artifacts) → Deploy (to environment) → Verify (health checks); a failure at any stage stops the pipeline and triggers a rebuild.

Common CI/CD Tools

  • GitHub Actions - Built into GitHub, YAML-based workflows
  • GitLab CI/CD - Integrated with GitLab, powerful pipelines
  • Jenkins - Self-hosted, highly customizable
  • CircleCI / Travis CI - Cloud-based CI/CD platforms
❗ Artifacts & Version Management: Artifacts (often a Docker image) are stored in artifact repositories and versioned using semantic versioning (e.g., 1.2.3).
This allows re-deploying a specific version if needed.
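Semantic versions compare numerically field by field, not as strings ("1.10.0" is newer than "1.2.3"). A minimal Kotlin sketch, ignoring pre-release and build metadata:

```kotlin
data class Version(val major: Int, val minor: Int, val patch: Int) : Comparable<Version> {
    override fun compareTo(other: Version): Int =
        compareValuesBy(this, other, Version::major, Version::minor, Version::patch)

    companion object {
        // Parses "major.minor.patch", e.g. "1.2.3"
        fun parse(s: String): Version {
            val (major, minor, patch) = s.split(".").map { it.toInt() }
            return Version(major, minor, patch)
        }
    }
}
```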

Deployment Strategies

Blue-Green Deployment

Blue-Green deployment eliminates downtime by running two identical production environments and switching traffic instantly.

How It Works

Two production environments run in parallel. The load balancer controls which environment receives user traffic. Deploy to the inactive environment, verify it works correctly, then flip the switch. The old environment stays ready for instant rollback.
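The traffic switch itself can be pictured as a single atomic pointer swap. A Kotlin sketch (environment names are illustrative):

```kotlin
import java.util.concurrent.atomic.AtomicReference

// All traffic goes to whichever environment is active; switching over
// (or rolling back) is one atomic swap, hence the "instant" rollback.
class BlueGreenRouter(initiallyActive: String) {
    private val active = AtomicReference(initiallyActive)
    fun route(): String = active.get()
    fun switchTo(environment: String) = active.set(environment)
}
```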

Users → Load Balancer → switches all traffic between Blue (v1.0) and Green (v2.0, new)

Advantages

  • Zero downtime deployment
  • Instant rollback (just switch back)
  • Test new version in production-like environment
  • Simple and reliable process
  • Reduces deployment risk significantly

Disadvantages

  • Requires double infrastructure (costly)
  • Database migrations need careful planning
  • Shared resources (DB) must be compatible with both versions
  • Not suitable for stateful applications without planning
❗ When to Use: Best for applications where you can afford double infrastructure and need absolute reliability. Common in financial services, e-commerce, and critical enterprise applications. Perfect when you need instant rollback capability and zero downtime is essential.

Canary Deployment

Canary deployment gradually rolls out changes to a small percentage of users first, allowing you to detect issues before full deployment.

How It Works

The canary approach tests new code with real users in production at minimal risk. The router directs a small percentage of requests to the new version while most traffic stays on the stable version. You monitor the canary closely and gradually increase traffic as confidence grows.
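Sticky percentage routing can be sketched by hashing a stable user identifier, so the same user consistently sees the same version (version labels here are illustrative):

```kotlin
// Roughly canaryPercent of users land on the new version; hashing the
// user id makes the routing deterministic ("sticky") per user.
fun versionFor(userId: String, canaryPercent: Int): String =
    if (userId.hashCode().mod(100) < canaryPercent) "v2.0" else "v1.0"
```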

Router splits user traffic: 95% to the v1.0 instances, 5% to the v2.0 canary.

Advantages

  • Early issue detection with real user traffic
  • Reduced blast radius (only 5% of users affected)
  • Real production feedback before full rollout
  • No duplicate infrastructure needed
  • Gradual confidence building

Disadvantages

  • Requires sophisticated traffic routing system
  • Monitoring and metrics complexity
  • Slower rollout than blue-green
  • Need to handle mixed versions
❗ When to Use: Ideal for high-traffic applications where you want to validate changes with real users before full rollout. Common at Netflix, Facebook, Google for large-scale deployments. Best when you have good monitoring and can afford slower deployments for safety.

Rolling Update Deployment

Rolling updates gradually replace instances one-by-one or in batches, maintaining service availability throughout the deployment.

How It Works

Instances are updated in a controlled sequence while the service remains available. The load balancer continues routing traffic to healthy instances. Each instance is taken offline, updated, verified, and returned to service before moving to the next.
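The staged replacement can be simulated in a few lines; this sketch records a snapshot of the fleet after each batch:

```kotlin
// Replace instances batch by batch, keeping a snapshot after every stage.
fun rollingUpdate(fleet: List<String>, newVersion: String, batchSize: Int): List<List<String>> {
    val current = fleet.toMutableList()
    val stages = mutableListOf(current.toList())
    for (batch in current.indices.chunked(batchSize)) {
        batch.forEach { i -> current[i] = newVersion }
        stages += current.toList()
    }
    return stages
}
```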

Stage 1: v1 v1 v1 v1
Stage 2: v2 v1 v1 v1
Stage 3: v2 v2 v2 v2

Advantages

  • No duplicate infrastructure needed
  • Gradual rollout reduces risk
  • Can pause deployment if issues detected
  • Built into Kubernetes and most orchestrators
  • Resource efficient

Disadvantages

  • Multiple versions running simultaneously
  • Slower than blue-green deployment
  • Rollback requires another rolling update
  • Requires backward-compatible versions
❗ When to Use: Default deployment strategy for Kubernetes and container orchestration platforms. Provides a good balance between safety and resource efficiency. Best when versions are backward compatible and you can tolerate gradual rollout times.

Feature Flags (Feature Toggles)

Feature flags decouple code deployment from feature release, allowing you to control feature visibility independently.

How It Works

Code deployment and feature activation are independent. Deploy the code first, then control when users see new features. Toggle flags ON/OFF instantly through a management interface. This enables safe experimentation and gradual rollouts.
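A minimal in-memory flag store shows the idea; production systems (LaunchDarkly, Unleash) fetch flag state from a central service instead:

```kotlin
// Unknown flags default to OFF, so shipping dark (inactive) code is safe.
class FeatureFlags {
    private val flags = mutableMapOf<String, Boolean>()
    fun toggle(name: String, on: Boolean) { flags[name] = on }
    fun isEnabled(name: String): Boolean = flags[name] ?: false
}
```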

if (featureFlags.isEnabled("newCheckout")) {
    processNewCheckout(order)
} else {
    processOldCheckout(order)
}
Day 1: Deploy code v2.0 (feature OFF) → Day 1, later: Toggle the feature flag ON → Day 2: Feature live, visible to users

Advantages

  • Decouple deployment from feature release
  • Instant rollback (just toggle OFF)
  • A/B testing and gradual rollouts
  • Test in production safely
  • Enable features for specific users or percentages

Disadvantages

  • Code complexity with conditionals
  • Technical debt if flags not cleaned up
  • Requires feature flag infrastructure
  • Multiple code paths to test and maintain
❗ When to Use: Extremely powerful for continuous delivery and experimentation. Use for risky features, A/B tests, or gradual rollouts to specific user segments. Clean up old flags after features are stable to avoid code bloat. Popular tools: LaunchDarkly, Split.io, Unleash.

Logging Strategies

Why Logging Matters

Logs are essential for debugging, monitoring, and understanding application behavior in production

Logging provides visibility into your application's runtime behavior. Good logging practices help you diagnose issues, track performance, and audit system activity.

What to Log

  • ✅ Application Events - Startup, shutdown, configuration changes
  • ✅ Business Logic - Important business transactions and decisions
  • ✅ Errors and Exceptions - All errors with context and stack traces
  • ✅ Performance Metrics - Slow operations, query times, external API calls
  • ✅ Security Events - Authentication attempts, authorization failures, suspicious activity
  • ✅ Integration Points - External service calls, database queries, message queue operations

What NOT to Log

  • ❌ Passwords or credentials
  • ❌ Personal Identifiable Information (PII)
  • ❌ Credit card or payment information
  • ❌ Session tokens or API keys

Log Levels

Use appropriate log levels to categorize messages by severity

Standard Log Levels

From most to least severe

  • ERROR - Something went wrong that prevents normal execution
  • WARN - Something unexpected happened, but the application can continue
  • INFO - Important business events and milestones
  • DEBUG - Detailed information useful for debugging
  • TRACE - Very detailed information, typically only enabled temporarily
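The levels form an ordered severity scale: a message is emitted only when its level is at or above the configured threshold. A Kotlin sketch of that filtering rule:

```kotlin
// Ordered from least to most severe; enum ordering gives the comparison for free.
enum class LogLevel { TRACE, DEBUG, INFO, WARN, ERROR }

fun shouldLog(threshold: LogLevel, messageLevel: LogLevel): Boolean =
    messageLevel >= threshold
```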

📝 In production, typically set the log level to INFO or WARN. Use DEBUG and TRACE only when troubleshooting specific issues.

Structured Logging

Structured logging outputs logs in a machine-readable format (usually JSON), making it easier to search, filter, and analyze logs in log management systems.

Traditional vs Structured Logging

// Traditional logging (text)
logger.info("User john@example.com created order #12345 for $99.99")

// Structured logging (JSON-friendly)
logger.info(
    "Order created",
    kv("userId", "john@example.com"),
    kv("orderId", 12345),
    kv("amount", 99.99),
    kv("currency", "USD")
)

JSON Output

{
  "timestamp": "2024-03-15T10:30:45.123Z",
  "level": "INFO",
  "message": "Order created",
  "userId": "john@example.com",
  "orderId": 12345,
  "amount": 99.99,
  "currency": "USD",
  "service": "order-service",
  "traceId": "abc-123-def-456"
}

Benefits

  • Easy to parse and analyze
  • Search by specific fields
  • Aggregate and visualize metrics
  • Works well with log management tools (ELK, Splunk, Datadog)
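Rendering such an entry is just serializing a map of fields. This hand-rolled sketch is for illustration only; real applications delegate JSON encoding to the logging backend (e.g., logstash-logback-encoder):

```kotlin
// Builds a flat JSON log line from a message plus context fields.
// Assumes at least one field and values needing no JSON escaping.
fun structuredLog(level: String, message: String, fields: Map<String, Any>): String {
    val extras = fields.entries.joinToString(",") { (key, value) ->
        val rendered = if (value is Number) value.toString() else "\"$value\""
        "\"$key\":$rendered"
    }
    return """{"level":"$level","message":"$message",$extras}"""
}
```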

Logging Best Practices

Do's

  • ✅ Use meaningful, descriptive messages
  • ✅ Include context (user ID, request ID, trace ID)
  • ✅ Log at appropriate levels
  • ✅ Use parameterized logging (avoid string concatenation)
  • ✅ Include timestamps and service name
  • ✅ Log exceptions with full stack traces
  • ✅ Use correlation IDs to trace requests across services

Don'ts

  • ❌ Don't log in loops without throttling
  • ❌ Don't log sensitive information
  • ❌ Don't use System.out.println() or printStackTrace()
  • ❌ Don't log too much (impacts performance)
  • ❌ Don't log too little (makes debugging impossible)

Metrics & Monitoring

Application Monitoring

Monitoring provides real-time visibility into your application's health, performance, and behavior

Why Monitor?

  • Detect Issues Early - Catch problems before users report them
  • Understand Performance - Track response times, throughput, error rates
  • Capacity Planning - Know when to scale resources
  • SLA Compliance - Ensure you meet service level agreements
  • Troubleshooting - Diagnose performance bottlenecks

Observability Three Pillars

  1. Logs - What happened (events)
  2. Metrics - How much/how many (measurements)
  3. Traces - Request flow across services

💡 While logs tell you what happened in the past, metrics and monitoring tell you what's happening right now.
They help you detect issues before they impact users and understand system behavior over time.

Spring Boot Actuator

Spring Boot Actuator provides production-ready features for monitoring and managing your application

Adding Actuator

// build.gradle.kts
dependencies {
    implementation("org.springframework.boot:spring-boot-starter-actuator")
    implementation("io.micrometer:micrometer-registry-prometheus")
}

Configuration

# application.yml
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  endpoint:
    health:
      show-details: always
  metrics:
    tags:
      application: ${spring.application.name}

Built-in Endpoints

  • /actuator/health - Application health status
  • /actuator/info - Application information
  • /actuator/metrics - Application metrics
  • /actuator/prometheus - Prometheus-formatted metrics
  • /actuator/env - Environment properties
  • /actuator/loggers - Logger configuration
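Under the hood, a metrics endpoint serves named counters and gauges collected in memory. This plain-Kotlin sketch mimics what a Micrometer counter registry does (it is not the Micrometer API):

```kotlin
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.LongAdder

// Thread-safe named counters, the kind of data /actuator/metrics exposes.
class CounterRegistry {
    private val counters = ConcurrentHashMap<String, LongAdder>()
    fun increment(name: String) = counters.computeIfAbsent(name) { LongAdder() }.increment()
    fun count(name: String): Long = counters[name]?.sum() ?: 0L
}
```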

Monitoring Tools & Platforms

Popular tools for collecting, storing, and visualizing metrics

Popular Monitoring Tools

  • Metrics Collection & Storage
    • Prometheus - Open-source metrics collection and time-series database
    • Micrometer - Vendor-neutral metrics facade for Spring Boot
    • InfluxDB - Time-series database optimized for metrics
  • Visualization & Dashboards
    • Grafana - Create beautiful dashboards from metrics data
    • Kibana - Visualize logs and metrics from Elasticsearch
  • All-in-One Platforms
    • Datadog - Full-stack observability platform
    • New Relic - Application performance monitoring (APM)
    • Dynatrace - AI-powered observability
    • Elastic Stack (ELK) - Elasticsearch, Logstash, Kibana for logs and metrics

Key Metrics to Monitor

  • Request rate and response time
  • Error rate and types
  • CPU and memory usage
  • Database query performance
  • External API latency
  • Queue length and processing time

Observability and Monitoring

Observability

Observability is the ability to measure the internal states of a system by examining its outputs.

The three pillars of observability are:

  • Logs - Records of events that happened in your application
  • Metrics - Numerical data about your application's performance
  • Traces - Records of the path taken by requests through your application

These help you understand what is happening in your production systems and debug issues when they occur.

Monitoring

Monitoring is the practice of collecting, processing, and analyzing observability data to understand system health.

Key aspects of monitoring include:

  • Setting up dashboards to visualize system health
  • Creating alerts for when things go wrong
  • Tracking SLIs (Service Level Indicators) and SLOs (Service Level Objectives)
  • Performance monitoring and capacity planning

Good monitoring helps you catch issues before they affect users and helps you make data-driven decisions about your system.

Developing Resilient Applications

Developing Resilient Applications

Building robust backends means expecting failure and responding clearly, consistently, and safely.
  • Validate Early
    Use require, check, and null checks to catch errors before they propagate.

  • Fail Fast, Fail Loud
    Throw descriptive exceptions when invariants are broken.

  • Define Custom Exceptions
    Clarify intent and aid in debugging and response formatting.

  • Handle Exceptions Globally
    Centralize response behavior with StatusPages for consistency.

  • Log Errors with Context
    Include trace IDs and timestamps to help trace and debug issues.

  • Use Standard Error Responses
    Clients benefit from consistent, parseable error formats.

  • Test Failure Scenarios
    Robust apps are not those that never fail, but those that fail gracefully.

❗ Remember: Resilient applications expect the unexpected and handle errors gracefully while providing clear feedback to both users and developers.
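The validate-early and fail-fast points above can be sketched together; the exception type, function, and messages are illustrative:

```kotlin
// Fail fast with a descriptive, domain-specific exception.
class InsufficientStockException(message: String) : RuntimeException(message)

fun reserveStock(quantity: Int, available: Int): Int {
    // require throws IllegalArgumentException before the error can propagate
    require(quantity > 0) { "quantity must be positive, was $quantity" }
    if (available < quantity) {
        throw InsufficientStockException("requested $quantity, only $available available")
    }
    return available - quantity
}
```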

Practice