Deploying Axum Applications
Deploying Axum applications requires attention to performance, reliability, and security considerations. This guide covers production deployment strategies.

Running with Hyper

Axum uses Hyper under the hood. The standard way to run an Axum app:
use axum::Router;

/// Minimal production entry point: bind a TCP listener and serve the app.
#[tokio::main]
async fn main() {
    let app = Router::new();

    // 0.0.0.0 exposes the server on all interfaces; use 127.0.0.1 for local-only.
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000")
        .await
        .unwrap();

    println!("Listening on {}", listener.local_addr().unwrap());
    // `serve` returns an io::Result; don't silently discard a fatal server error.
    axum::serve(listener, app).await.unwrap();
}
Binding to 0.0.0.0 makes your server accessible from outside the host machine. Use 127.0.0.1 for local-only access.

Graceful shutdown

Handle shutdown signals gracefully to avoid dropping in-flight requests:

Basic graceful shutdown

use axum::Router;
use tokio::net::TcpListener;
use tokio::signal;

/// Entry point demonstrating graceful shutdown: in-flight requests are drained
/// after a shutdown signal instead of being dropped.
#[tokio::main]
async fn main() {
    let app = Router::new();
    let listener = TcpListener::bind("0.0.0.0:3000").await.unwrap();

    // `serve` resolves once `shutdown_signal()` completes and in-flight requests
    // finish; surface any fatal I/O error instead of discarding the Result.
    axum::serve(listener, app)
        .with_graceful_shutdown(shutdown_signal())
        .await
        .unwrap();
}

/// Resolves when the process receives a shutdown signal.
///
/// Waits for either Ctrl+C (SIGINT) or — on Unix — SIGTERM, which is what
/// container orchestrators such as Kubernetes send before killing a pod.
async fn shutdown_signal() {
    // Future that completes on Ctrl+C / SIGINT.
    let ctrl_c = async {
        signal::ctrl_c()
            .await
            .expect("failed to install Ctrl+C handler");
    };

    // Future that completes on SIGTERM (Unix only).
    #[cfg(unix)]
    let terminate = async {
        signal::unix::signal(signal::unix::SignalKind::terminate())
            .expect("failed to install signal handler")
            .recv()
            .await;
    };

    // No SIGTERM on non-Unix platforms: use a future that never resolves so the
    // select below only reacts to Ctrl+C.
    #[cfg(not(unix))]
    let terminate = std::future::pending::<()>();

    // Complete as soon as either signal arrives.
    tokio::select! {
        _ = ctrl_c => {},
        _ = terminate => {},
    }

    println!("Shutdown signal received, starting graceful shutdown");
}
The graceful shutdown sequence proceeds in four steps:

1. Handle SIGTERM and SIGINT — listen for shutdown signals from the operating system or container orchestrator.

2. Stop accepting new requests — the server stops accepting new connections but continues processing existing ones.

3. Wait for in-flight requests — allow existing requests to complete normally.

4. Clean shutdown — close all connections and exit gracefully.

With timeout protection

Prevent requests from hanging during shutdown:
use axum::{
    http::StatusCode,
    routing::get,
    Router,
};
use tower_http::timeout::TimeoutLayer;
use std::time::Duration;

/// Entry point combining graceful shutdown with a per-request timeout so slow
/// handlers cannot hold the shutdown open indefinitely.
#[tokio::main]
async fn main() {
    let app = Router::new()
        .route("/slow", get(|| async {
            tokio::time::sleep(Duration::from_secs(5)).await;
            "Done"
        }))
        .layer(
            // Requests are aborted after 10 seconds with 408 Request Timeout.
            // `TimeoutLayer::new` is the stable tower-http constructor; 408 is
            // already its response status, so no explicit StatusCode is needed.
            TimeoutLayer::new(Duration::from_secs(10)),
        );

    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000")
        .await
        .unwrap();

    // Propagate fatal server errors instead of discarding the Result.
    axum::serve(listener, app)
        .with_graceful_shutdown(shutdown_signal())
        .await
        .unwrap();
}

Advanced serving with Hyper

For more control over the HTTP server, use Hyper’s low-level API:
use axum::{
    extract::Request,
    Router,
};
use hyper::body::Incoming;
use hyper_util::rt::{TokioExecutor, TokioIo};
use hyper_util::server;
use tokio::net::TcpListener;
use tower::Service;
use std::net::SocketAddr;

/// Entry point using Hyper's low-level connection API for per-connection control.
#[tokio::main]
async fn main() {
    let app = Router::new();
    let listener = TcpListener::bind("0.0.0.0:3000").await.unwrap();

    loop {
        // A single failed accept (e.g. fd exhaustion) must not crash the whole
        // server: log it and keep accepting.
        let (socket, _remote_addr) = match listener.accept().await {
            Ok(conn) => conn,
            Err(err) => {
                eprintln!("Failed to accept connection: {err:#}");
                continue;
            }
        };
        let tower_service = app.clone();

        // Serve each connection on its own task so one slow client cannot
        // block the accept loop.
        tokio::spawn(async move {
            let socket = TokioIo::new(socket);

            // Adapt the tower `Router` into a hyper `Service`.
            let hyper_service = hyper::service::service_fn(
                move |request: Request<Incoming>| {
                    tower_service.clone().call(request)
                }
            );

            // `auto` negotiates HTTP/1.1 vs HTTP/2; `with_upgrades` keeps
            // protocol upgrades (e.g. WebSockets) working.
            if let Err(err) = server::conn::auto::Builder::new(TokioExecutor::new())
                .serve_connection_with_upgrades(socket, hyper_service)
                .await
            {
                eprintln!("Failed to serve connection: {err:#}");
            }
        });
    }
}
This low-level approach is useful when you need:
  • Custom connection handling logic
  • Per-connection state or middleware
  • Fine-grained control over HTTP/1.1 vs HTTP/2
  • Custom TLS configuration
  • Performance optimization for specific use cases

Configuration management

Use environment variables and configuration files:
use axum::Router;
use serde::Deserialize;
use std::net::SocketAddr;

/// Server configuration, deserialized from environment variables.
///
/// Every field has a serde default, so the struct deserializes even when a
/// variable is unset.
#[derive(Deserialize)]
struct Config {
    /// Bind address; defaults to 0.0.0.0 (all interfaces).
    #[serde(default = "default_host")]
    host: String,
    /// TCP port; defaults to 3000.
    #[serde(default = "default_port")]
    port: u16,
    /// Worker count; defaults to the number of CPUs reported by `num_cpus`.
    #[serde(default = "default_workers")]
    workers: usize,
}

/// Default bind address: all interfaces.
fn default_host() -> String {
    String::from("0.0.0.0")
}

/// Default TCP port the server listens on.
fn default_port() -> u16 {
    3_000
}

/// Default worker count: one per CPU available to this process.
///
/// Uses the standard library's `std::thread::available_parallelism` (which
/// honours cgroup/affinity limits) instead of the third-party `num_cpus`
/// crate, dropping a dependency. Falls back to 1 when the parallelism
/// cannot be determined.
fn default_workers() -> usize {
    std::thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
}

impl Config {
    /// Builds a `Config` from environment variables via the `config` crate
    /// (presumably `HOST`, `PORT`, `WORKERS` — the default `Environment`
    /// source maps upper-case variable names to lower-case fields; verify
    /// against the crate version in use).
    ///
    /// # Errors
    /// Returns `config::ConfigError` when a present variable cannot be
    /// deserialized into its field's type.
    fn from_env() -> Result<Self, config::ConfigError> {
        config::Config::builder()
            .add_source(config::Environment::default())
            .build()?
            .try_deserialize()
    }
}

/// Entry point that builds the bind address from environment-driven config,
/// falling back to compiled-in defaults when the environment is unusable.
#[tokio::main]
async fn main() {
    // Fall back to the defaults rather than aborting on a malformed environment.
    let config = Config::from_env().unwrap_or_else(|_| Config {
        host: default_host(),
        port: default_port(),
        workers: default_workers(),
    });

    let addr: SocketAddr = format!("{}:{}", config.host, config.port)
        .parse()
        .expect("Invalid address");

    let app = Router::new();
    let listener = tokio::net::TcpListener::bind(addr).await.unwrap();

    println!("Server running on {addr}");
    // Propagate fatal server errors instead of discarding the Result.
    axum::serve(listener, app).await.unwrap();
}

Logging and observability

Implement comprehensive logging for production:
use tracing_subscriber::{
    layer::SubscriberExt,
    util::SubscriberInitExt,
    EnvFilter,
};

/// Entry point that installs structured (JSON) logging before anything else.
#[tokio::main]
async fn main() {
    // Respect RUST_LOG when set; otherwise default to info-level logs with
    // verbose tower_http middleware output.
    let filter = EnvFilter::try_from_default_env()
        .unwrap_or_else(|_| "info,tower_http=debug".into());

    tracing_subscriber::registry()
        .with(filter)
        .with(tracing_subscriber::fmt::layer().json())
        .init();

    // Your app setup...
}

Performance optimization

Connection pooling

Use connection pools for databases:
use sqlx::postgres::PgPoolOptions;
use std::time::Duration;

/// Entry point demonstrating a production database connection pool.
#[tokio::main]
async fn main() {
    // Prefer taking credentials from the environment (standard for container
    // deployments); the literal remains only as a backward-compatible fallback.
    let database_url = std::env::var("DATABASE_URL")
        .unwrap_or_else(|_| "postgres://user:pass@localhost/db".to_string());

    let pool = PgPoolOptions::new()
        // Cap concurrent connections so the database isn't overwhelmed.
        .max_connections(100)
        // Fail fast instead of queueing forever when the pool is exhausted.
        .acquire_timeout(Duration::from_secs(3))
        .connect(&database_url)
        .await
        .expect("Failed to create pool");

    let app = Router::new()
        .route("/users", get(list_users))
        .with_state(pool);

    // Server setup...
}

Response compression

use tower_http::compression::CompressionLayer;

// Compress responses; the layer negotiates the encoding from the client's
// Accept-Encoding header.
let app = Router::new()
    .route("/api/data", get(large_response))
    .layer(CompressionLayer::new());

Body size limits

use tower_http::limit::RequestBodyLimitLayer;

// In axum, `.layer` only wraps routes added *before* the call — so the route
// must come first, otherwise the body limit never applies to /upload.
let app = Router::new()
    .route("/upload", post(upload_handler))
    .layer(RequestBodyLimitLayer::new(10 * 1024 * 1024)); // 10 MB

Docker deployment

Create an optimized Dockerfile:
# Build stage
FROM rust:1.75 as builder
WORKDIR /app

# Copy manifests
COPY Cargo.toml Cargo.lock ./

# Build dependencies (cached layer): a dummy main.rs lets cargo compile every
# dependency; Docker caches this layer so deps are not rebuilt when only
# application code changes.
RUN mkdir src && echo "fn main() {}" > src/main.rs
RUN cargo build --release
RUN rm -rf src

# Copy source and build. `touch` bumps main.rs's mtime so cargo rebuilds the
# binary itself (the cached dependency artifacts are reused).
COPY src ./src
RUN touch src/main.rs
RUN cargo build --release

# Runtime stage
FROM debian:bookworm-slim

# Install runtime dependencies (TLS root certs only)
RUN apt-get update && apt-get install -y \
    ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Copy binary from builder
COPY --from=builder /app/target/release/myapp /usr/local/bin/myapp

# Create non-root user: limits the blast radius of a container compromise
RUN useradd -ms /bin/bash appuser
USER appuser

EXPOSE 3000

CMD ["myapp"]
Multi-stage builds keep your final image small by excluding build dependencies.

Kubernetes deployment

Example Kubernetes manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: axum-app
spec:
  replicas: 3  # three pods for availability and rolling updates
  selector:
    matchLabels:
      app: axum-app
  template:
    metadata:
      labels:
        app: axum-app
    spec:
      containers:
      - name: axum-app
        image: myregistry/axum-app:latest
        ports:
        - containerPort: 3000
        env:
        - name: HOST
          value: "0.0.0.0"
        - name: PORT
          value: "3000"
        - name: RUST_LOG
          value: "info"
        # Liveness: restart the container if /health stops responding.
        livenessProbe:
          httpGet:
            path: /health
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 10
        # Readiness: remove the pod from Service endpoints until /ready succeeds.
        readinessProbe:
          httpGet:
            path: /ready
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 5
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "500m"
---
# Exposes the deployment: external port 80 maps to container port 3000.
apiVersion: v1
kind: Service
metadata:
  name: axum-app
spec:
  selector:
    app: axum-app
  ports:
  - port: 80
    targetPort: 3000
  type: LoadBalancer

Health checks

Implement health and readiness endpoints:
use axum::{
    extract::State,
    http::StatusCode,
    routing::get,
    Json, Router,
};
use serde_json::json;

/// Shared application state handed to handlers via the `State` extractor.
#[derive(Clone)]
struct AppState {
    /// Postgres connection pool; the readiness probe queries through it.
    db_pool: sqlx::PgPool,
}

/// Liveness probe: always 200 OK — proves only that the process is responsive.
async fn health() -> StatusCode {
    StatusCode::OK
}

/// Readiness probe: 200 when a trivial query succeeds against the database,
/// 503 otherwise — so the load balancer stops routing traffic to this pod.
async fn readiness(State(state): State<AppState>) -> (StatusCode, Json<serde_json::Value>) {
    match sqlx::query("SELECT 1")
        .fetch_one(&state.db_pool)
        .await
    {
        Ok(_) => (StatusCode::OK, Json(json!({ "status": "ready" }))),
        Err(_) => (
            StatusCode::SERVICE_UNAVAILABLE,
            Json(json!({ "status": "not ready" })),
        ),
    }
}

// Wire both probes into the router; the paths must match the Kubernetes
// liveness/readiness probe configuration.
let app = Router::new()
    .route("/health", get(health))
    .route("/ready", get(readiness))
    .with_state(app_state);

Security best practices

Always validate and sanitize user input. Never trust data from external sources.

HTTPS/TLS

For TLS termination, use a reverse proxy (nginx, Caddy) or cloud load balancer rather than handling TLS in your Axum app.

Security headers

use tower_http::set_header::SetResponseHeaderLayer;
// `HeaderValue` must be imported alongside `header`; the snippet previously
// referenced it without bringing it into scope.
use http::{header, HeaderValue};

// `if_not_present` lets a specific handler override the default when needed.
let app = Router::new()
    .layer(SetResponseHeaderLayer::if_not_present(
        header::X_CONTENT_TYPE_OPTIONS,
        HeaderValue::from_static("nosniff"),
    ))
    .layer(SetResponseHeaderLayer::if_not_present(
        header::X_FRAME_OPTIONS,
        HeaderValue::from_static("DENY"),
    ));

Rate limiting

use std::sync::Arc;
use tower_governor::{
    governor::GovernorConfigBuilder,
    GovernorLayer,
};

// tower-governor shares its config via `Arc`, not `Box`: the layer is cloned
// per service, so the config must be cheaply shareable.
let governor_conf = Arc::new(
    GovernorConfigBuilder::default()
        .per_second(10) // steady refill rate
        .burst_size(20) // short-term burst allowance
        .finish()
        .unwrap(),
);

let app = Router::new()
    .route("/api", get(handler))
    .layer(GovernorLayer { config: governor_conf });

Monitoring and metrics

use axum::routing::get;
use metrics_exporter_prometheus::{Matcher, PrometheusBuilder, PrometheusHandle};

/// Installs the global Prometheus recorder and returns a handle used to render
/// the scrape output.
fn setup_metrics() -> PrometheusHandle {
    PrometheusBuilder::new()
        // Explicit histogram buckets for request latency, in seconds.
        .set_buckets_for_metric(
            Matcher::Full("http_request_duration_seconds".to_string()),
            &[0.001, 0.01, 0.1, 1.0, 10.0],
        )
        .unwrap()
        .install_recorder()
        .unwrap()
}

/// Entry point exposing Prometheus metrics at /metrics.
#[tokio::main]
async fn main() {
    let recorder_handle = setup_metrics();

    // The handler takes ownership of the handle and renders the current
    // scrape output on each request.
    let render_metrics = move || async move { recorder_handle.render() };

    let app = Router::new().route("/metrics", get(render_metrics));

    // Server setup...
}
Monitor key metrics: request rate, error rate, response time (latency), and saturation (resource usage).