Ceph Administration – Master Open Source Distributed Storage
Online training
3 days (21 hours)

Learn to deploy and manage Ceph storage clusters to build reliable, high-performance, and scalable distributed storage infrastructure. Hands-on 3-day training covering RADOS architecture, core components, and basic administration.

Training objectives

Upon completion of this training, you will be able to:

  • Understand Ceph architecture and the RADOS (Reliable Autonomic Distributed Object Store) system
  • Install and configure a functional Ceph cluster using Cephadm
  • Manage core components: Monitors, OSDs, Managers, and MDS
  • Implement block storage with RADOS Block Device (RBD)
  • Deploy object storage via RADOS Gateway (RGW) with S3 compatibility
  • Configure file storage using CephFS
  • Administer pools and manage data replication
  • Use the Ceph dashboard for graphical monitoring
  • Diagnose and resolve common issues
  • Apply best practices for performance and security

Target audience

This training is designed for:

System Administrators

Linux professionals looking to move into distributed storage administration and build Software-Defined Storage skills

Infrastructure Engineers

Engineers responsible for deploying and managing scalable storage infrastructures for modern datacenters

Cloud Architects

Experts seeking a high-performance open source storage solution as an alternative to costly proprietary offerings

OpenStack/Kubernetes Administrators

Professionals wanting to integrate Ceph as a storage backend for their cloud platforms

DevOps Engineers

Developers and operations staff looking to automate storage provisioning with Ansible

Training adapted for French-speaking African professionals seeking sovereign and cost-effective storage solutions.

Prerequisites

Technical Prerequisites

Required

  • Linux Administration: Proficiency in basic commands, systemd service management, file editing
  • Storage Concepts: Understanding of basic concepts (partitions, file systems, RAID)
  • TCP/IP Networking: Fundamental knowledge (IP addressing, ports, protocols)
  • Shell Scripting: Ability to read and understand simple Bash scripts

Required Hardware Configuration

For online hands-on labs, each participant must have:

  • Stable internet connection (minimum 10 Mbps)
  • Modern web browser (Chrome, Firefox, Edge)
  • SSH client (Terminal, PuTTY, or equivalent)
  • Lab environment provided in the ECINTELLIGENCE cloud

Detailed program

Day 1: Fundamentals and Ceph Architecture

Module 1: Introduction to Distributed Storage and Ceph (3h)

  • Storage evolution: from DAS/NAS/SAN to Software-Defined Storage
  • Traditional storage challenges and distributed storage advantages
  • Ceph project history and philosophy
  • RADOS architecture: the heart of Ceph
  • Key concepts: pools, placement groups (PGs), CRUSH map
  • Use cases and success stories in Africa
  • Comparison with other solutions (GlusterFS, MinIO)
Hands-on Lab:

Exploring a demo Ceph cluster, analyzing the architecture
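
As a preview, here is a minimal sketch of the exploration commands typically used in this lab; it assumes the demo cluster is already deployed and reachable with an admin keyring:

```bash
# Overall cluster health, daemon summary, and raw/usable capacity
ceph status
ceph health detail
ceph df

# CRUSH hierarchy: how hosts and OSDs are organised into failure domains
ceph osd tree

# Pools, replication size, and placement-group counts
ceph osd pool ls detail
```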

Module 2: Essential Ceph Components (4h)

  • Monitors (MON): Cluster map and quorum management
    • Role in cluster consistency
    • Quorum configuration (3, 5, 7 monitors)
    • Epoch and version management
  • OSDs (Object Storage Daemons): Data storage
    • Disk and partition management
    • Replication and recovery process
    • Journal and metadata
  • Managers (MGR): Metrics and plugins
    • Web dashboard
    • Prometheus, Zabbix modules
    • RESTful API
  • MDS (Metadata Server): For CephFS
    • Metadata cache
    • Active/Standby configuration
Hands-on Lab:

Installing a 3-node Ceph cluster with Cephadm (Reef 18.2)
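
A minimal sketch of the bootstrap step practised in this lab, assuming cephadm is already installed on the first node and that 10.0.0.11 is a placeholder for that node's IP on the public network:

```bash
# Bootstrap a new cluster on the first node: deploys the initial
# MON and MGR and prints the dashboard URL and credentials
cephadm bootstrap --mon-ip 10.0.0.11

# Confirm the first monitor and manager daemons are running
ceph -s
ceph orch ps
```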

Day 2: Deployment and Storage Types

Module 3: Deployment with Cephadm (3h)

  • System and network prerequisites for Ceph
  • Cluster bootstrap with cephadm
  • Adding nodes and OSDs
  • Configuring public and cluster networks
  • Service management with orchestrator
  • Upgrading to Squid 19.2
  • Deployment strategies: All-in-one vs distributed
Hands-on Lab:

Complete cluster deployment, adding nodes, network configuration
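
A minimal sketch of the node, OSD, and network steps from this lab; hostnames, addresses, and subnets are placeholders:

```bash
# Distribute the cluster SSH key, then register the additional nodes
ssh-copy-id -f -i /etc/ceph/ceph.pub root@node2
ssh-copy-id -f -i /etc/ceph/ceph.pub root@node3
ceph orch host add node2 10.0.0.12
ceph orch host add node3 10.0.0.13

# Turn every unused, empty disk into an OSD
ceph orch apply osd --all-available-devices

# Separate the public network from the cluster (replication) network
ceph config set mon public_network 10.0.0.0/24
ceph config set global cluster_network 10.0.1.0/24

# Later in the module: rolling upgrade to Squid via the orchestrator
ceph orch upgrade start --ceph-version 19.2.0
```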

Module 4: Block Storage with RBD (2h)

  • RBD architecture and operation
  • Creating and managing RBD images
  • Snapshots and clones
  • Advanced features: layering, exclusive-lock
  • Integration with hypervisors (KVM/QEMU)
  • Mapping volumes on Linux clients
  • Performance tuning for database workloads
Hands-on Lab:

Creating RBD images, snapshots, client mounting, performance testing
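
A minimal sketch of the RBD workflow for this lab; pool and image names are placeholders, and the mapped device name may differ on your client:

```bash
# Create a pool for block storage and initialise it for RBD
ceph osd pool create rbd-pool
rbd pool init rbd-pool

# Create a 10 GiB image, snapshot it, and clone the snapshot
rbd create rbd-pool/vm-disk01 --size 10G
rbd snap create rbd-pool/vm-disk01@base
rbd snap protect rbd-pool/vm-disk01@base
rbd clone rbd-pool/vm-disk01@base rbd-pool/vm-disk02

# On a Linux client: map, format, and mount the image
rbd map rbd-pool/vm-disk01
mkfs.xfs /dev/rbd0
mkdir -p /mnt/vm-disk01 && mount /dev/rbd0 /mnt/vm-disk01

# Quick write benchmark against the image
rbd bench --io-type write rbd-pool/vm-disk01
```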

Module 5: Object Storage with RGW (2h)

  • Object storage concepts: buckets and objects
  • RADOS Gateway deployment
  • Multi-site and zone configuration
  • S3 and Swift APIs: compatibility and differences
  • User and quota management
  • Bucket policies and ACLs
  • Use cases: archiving, backup, CDN
Hands-on Lab:

RGW deployment, bucket creation, testing with AWS CLI and s3cmd
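
A minimal sketch of the RGW steps for this lab; the service name, user, bucket, and endpoint are placeholders, and the access/secret keys come from the user-creation output:

```bash
# Deploy two RADOS Gateway daemons with the orchestrator
ceph orch apply rgw s3gw --placement="2"

# Create an S3 user and note the generated access and secret keys
radosgw-admin user create --uid=demo --display-name="Demo user"

# Test with the AWS CLI against the local RGW endpoint
aws --endpoint-url http://rgw-host s3 mb s3://demo-bucket
aws --endpoint-url http://rgw-host s3 cp ./backup.tar.gz s3://demo-bucket/

# Or with s3cmd once it has been configured (s3cmd --configure)
s3cmd ls s3://demo-bucket
```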

Day 3: Advanced Administration and Optimization

Module 6: File Storage with CephFS (2h)

  • CephFS architecture: MDS and data/metadata pools
  • MDS deployment and configuration
  • Mounting CephFS: kernel client vs FUSE
  • Subvolume and snapshot management
  • Quotas and layouts
  • NFS export via NFS-Ganesha and SMB sharing via Samba
  • Use cases: file sharing, home directories
Hands-on Lab:

CephFS configuration, client mounting, load testing
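
A minimal sketch of the CephFS steps for this lab; the file system name, subvolume, monitor address, and client key are placeholders:

```bash
# Create a file system: data and metadata pools plus MDS daemons
# are created and deployed automatically by the orchestrator
ceph fs volume create cephfs

# Create a subvolume and snapshot it
ceph fs subvolume create cephfs projects
ceph fs subvolume snapshot create cephfs projects before-tests

# Mount with the kernel client (monitor address and key are placeholders)
mkdir -p /mnt/cephfs
mount -t ceph 10.0.0.11:6789:/ /mnt/cephfs -o name=admin,secret=<client-key>

# Alternative: FUSE client using the local ceph.conf and keyring
ceph-fuse /mnt/cephfs
```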

Module 7: Administration and Monitoring (3h)

  • Pool and CRUSH Management
    • Pool types: replicated vs erasure coded
    • CRUSH map modification
    • Custom placement rules
  • Dashboard and Monitoring
    • Navigating the Ceph dashboard
    • Key metrics: IOPS, throughput, latency
    • Prometheus/Grafana integration
  • Daily Maintenance
    • Essential commands: ceph status, health, df
    • OSD management: in/out, up/down
    • Scrubbing and deep-scrub processes
Hands-on Lab:

Monitoring configuration, creating custom dashboards
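
A minimal sketch of the pool, dashboard, and monitoring commands used in this lab; pool names and the erasure-coding profile are placeholders:

```bash
# Replicated pool vs erasure-coded pool (k=4 data, m=2 coding chunks)
ceph osd pool create rep-pool 32 32 replicated
ceph osd erasure-code-profile set ec-4-2 k=4 m=2
ceph osd pool create ec-pool 32 32 erasure ec-4-2

# Enable the dashboard and the Prometheus exporter modules
ceph mgr module enable dashboard
ceph mgr module enable prometheus
ceph mgr services          # lists the dashboard and exporter URLs

# Daily health and capacity checks
ceph status
ceph health detail
ceph df
ceph osd df tree
```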

Module 8: Performance and Troubleshooting (2h)

  • Performance analysis with rados bench and rbd bench
  • Optimization: cache tiering, optimal placement
  • Troubleshooting: slow requests, PG states
  • Recovery and backfilling: management and priorities
  • Practical failure scenarios and resolutions
  • Backup and disaster recovery strategies
  • Future developments: NVMe-oF, compression
Final Project:

Failure simulation and recovery, production cluster optimization
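
A minimal sketch of the kind of commands used during the failure-simulation project; the OSD ID and pool name are placeholders:

```bash
# Baseline write then sequential-read benchmark on a test pool
rados bench -p bench-pool 30 write --no-cleanup
rados bench -p bench-pool 30 seq

# Prevent automatic rebalancing during planned maintenance
ceph osd set noout

# Simulate an OSD failure, watch recovery, then bring it back
ceph osd out osd.3
ceph -w                    # follow PGs through degraded/recovering states
ceph osd in osd.3
ceph osd unset noout

# Investigate problem placement groups
ceph pg dump_stuck
ceph health detail
```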

Certification and Assessment

  • Knowledge validation quiz at the end of training
  • ECINTELLIGENCE training certificate
  • Complete course materials in English (250+ pages)
  • Lab access for 30 days after training
  • Optional preparation for Red Hat Ceph certification
  • Integration into the French-speaking Africa Ceph community

Certification

At the end of this training, you will receive a certificate of participation issued by squint.

1680 EUR

per participant

Duration

3 days (21 hours)

Format

Online training

Next session

On request

Request a quote

Ready to develop your skills?

Join hundreds of professionals who have trusted squint with their skills development.

View all our training courses
