
Ceph Administration – Master Open Source Distributed Storage
Learn to deploy and manage Ceph storage clusters to build reliable, high-performance, and scalable distributed storage infrastructure. Hands-on 3-day training covering RADOS architecture, core components, and basic administration.
Training objectives
Upon completion of this training, you will be able to:
- Understand Ceph architecture and the RADOS (Reliable Autonomic Distributed Object Store) system
- Install and configure a functional Ceph cluster using Cephadm
- Manage core components: Monitors, OSDs, Managers, and MDS
- Implement block storage with RADOS Block Device (RBD)
- Deploy object storage via RADOS Gateway (RGW) with S3 compatibility
- Configure file storage using CephFS
- Administer pools and manage data replication
- Use the Ceph dashboard for graphical monitoring
- Diagnose and resolve common issues
- Apply best practices for performance and security
Target audience
This training is designed for:
System Administrators
Linux professionals looking to move into distributed storage administration and build Software-Defined Storage skills
Infrastructure Engineers
Responsible for deploying and managing scalable storage infrastructures for modern datacenters
Cloud Architects
Experts seeking a high-performance open source storage solution as an alternative to costly proprietary offerings
OpenStack/Kubernetes Administrators
Professionals wanting to integrate Ceph as a storage backend for their cloud platforms
DevOps Engineers
Developers and operations staff looking to automate storage provisioning with Ansible
Training adapted for French-speaking African professionals seeking sovereign and cost-effective storage solutions.
Prerequisites
Technical Prerequisites
Required
- Linux Administration: Proficiency in basic commands, systemd service management, file editing
- Storage Concepts: Understanding of basic concepts (partitions, file systems, RAID)
- TCP/IP Networking: Fundamental knowledge (IP addressing, ports, protocols)
- Shell Scripting: Ability to read and understand simple Bash scripts
Recommended
- Experience with virtualization (KVM, VMware)
- Knowledge of distributed architectures
- Python basics (for automation scripts)
- Familiarity with high availability concepts
Required Hardware Configuration
For online hands-on labs, each participant must have:
- Stable internet connection (minimum 10 Mbps)
- Modern web browser (Chrome, Firefox, Edge)
- SSH client (Terminal, PuTTY, or equivalent)
- Lab environment provided in the ECINTELLIGENCE cloud
Detailed Training Program
Day 1: Fundamentals and Ceph Architecture
Module 1: Introduction to Distributed Storage and Ceph (3h)
- Storage evolution: from DAS/NAS/SAN to Software-Defined Storage
- Traditional storage challenges and distributed storage advantages
- Ceph project history and philosophy
- RADOS architecture: the heart of Ceph
- Key concepts: pools, placement groups (PGs), CRUSH map
- Use cases and success stories in Africa
- Comparison with other solutions (GlusterFS, MinIO)
Hands-on lab: Exploring a demo Ceph cluster and analyzing its architecture
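As a preview of this exploration, here is a minimal sketch of the read-only commands typically used to inspect a running cluster (the demo cluster itself is provided during the lab; nothing here modifies data):

```bash
# Overall cluster state: health, monitors, OSDs, and raw capacity
ceph -s
ceph health detail

# List pools with their replication settings and placement-group counts
ceph osd pool ls detail

# View the CRUSH hierarchy (host/rack/OSD) that drives data placement
ceph osd crush tree
ceph osd tree
```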
Module 2: Essential Ceph Components (4h)
- Monitors (MON): Cluster map and quorum management
- Role in cluster consistency
- Quorum configuration (3, 5, 7 monitors)
- Epoch and version management
- OSDs (Object Storage Daemons): Data storage
- Disk and partition management
- Replication and recovery process
- Journal and metadata
- Managers (MGR): Metrics and plugins
- Web dashboard
- Prometheus, Zabbix modules
- RESTful API
- MDS (Metadata Server): For CephFS
- Metadata cache
- Active/Standby configuration
Hands-on lab: Installing a 3-node Ceph cluster with cephadm (Ceph Reef 18.2)
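A minimal bootstrap sketch for this lab, assuming a fresh node with cephadm installed; the monitor IP address is a placeholder:

```bash
# Bootstrap the first node: deploys the initial monitor, manager, and dashboard
cephadm bootstrap --mon-ip 10.0.0.11

# Confirm the cluster is reachable and check monitor quorum
ceph -s
ceph quorum_status --format json-pretty
```

Additional hosts, OSDs, and network settings are handled with the orchestrator in Module 3.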
Day 2: Deployment and Storage Types
Module 3: Deployment with Cephadm (3h)
- System and network prerequisites for Ceph
- Cluster bootstrap with cephadm
- Adding nodes and OSDs
- Configuring public and cluster networks
- Service management with orchestrator
- Upgrading to Squid 19.2
- Deployment strategies: All-in-one vs distributed
Hands-on lab: Complete cluster deployment, adding nodes, configuring networks
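An illustrative continuation of the bootstrap sketch above; the hostnames and subnets are placeholders, not part of the course material:

```bash
# Register the remaining hosts with the orchestrator
ceph orch host add ceph-node2 10.0.0.12
ceph orch host add ceph-node3 10.0.0.13

# Create OSDs on every unused, available disk across the cluster
ceph orch apply osd --all-available-devices

# Separate client traffic (public) from replication traffic (cluster)
ceph config set global public_network 10.0.0.0/24
ceph config set global cluster_network 10.0.1.0/24
```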
Module 4: Block Storage with RBD (2h)
- RBD architecture and operation
- Creating and managing RBD images
- Snapshots and clones
- Advanced features: layering, exclusive-lock
- Integration with hypervisors (KVM/QEMU)
- Mapping volumes on Linux clients
- Performance tuning for database workloads
Hands-on lab: Creating RBD images and snapshots, mounting on a client, performance testing
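A minimal RBD sketch along the lines of this lab; the pool name, image name, and 10 GiB size are examples only:

```bash
# Create and initialize a pool dedicated to RBD images
ceph osd pool create rbd-pool
rbd pool init rbd-pool

# Create an image, take a snapshot, then map it on a Linux client
rbd create rbd-pool/db-volume --size 10G
rbd snap create rbd-pool/db-volume@before-import
rbd map rbd-pool/db-volume   # exposes a /dev/rbdX block device to format and mount
```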
Module 5: Object Storage with RGW (2h)
- Object storage concepts: buckets and objects
- RADOS Gateway deployment
- Multi-site and zone configuration
- S3 and Swift APIs: compatibility and differences
- User and quota management
- Bucket policies and ACLs
- Use cases: archiving, backup, CDN
Hands-on lab: RGW deployment, bucket creation, testing with the AWS CLI and s3cmd
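An illustrative sequence for this lab, assuming RGW is already deployed; the user ID, bucket name, and endpoint URL (including the port) are placeholders:

```bash
# Create an S3 user; the command returns an access key and a secret key
radosgw-admin user create --uid=demo --display-name="Demo User"

# Configure those keys with `aws configure`, then talk to RGW through its S3 API
aws --endpoint-url http://rgw-host:8080 s3 mb s3://backups
aws --endpoint-url http://rgw-host:8080 s3 cp ./archive.tar.gz s3://backups/
aws --endpoint-url http://rgw-host:8080 s3 ls s3://backups
```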
Day 3: Advanced Administration and Optimization
Module 6: File Storage with CephFS (2h)
- CephFS architecture: MDS and data/metadata pools
- MDS deployment and configuration
- Mounting CephFS: kernel client vs FUSE
- Subvolume and snapshot management
- Quotas and layouts
- NFS export via NFS-Ganesha and SMB export via Samba
- Use cases: file sharing, home directories
Hands-on lab: CephFS configuration, client mounting, load testing
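A minimal CephFS sketch for this lab, assuming a healthy cluster; the filesystem name, monitor address, and keyring path are placeholders:

```bash
# Create a filesystem: data and metadata pools plus an MDS are deployed automatically
ceph fs volume create labfs
ceph fs status labfs

# Kernel client mount (generally the fastest option)
mount -t ceph 10.0.0.11:6789:/ /mnt/labfs -o name=admin,secretfile=/etc/ceph/admin.secret

# FUSE client mount (an alternative on older kernels)
ceph-fuse /mnt/labfs
```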
Module 7: Administration and Monitoring (3h)
- Pool and CRUSH Management
- Pool types: replicated vs erasure coded
- CRUSH map modification
- Custom placement rules
- Dashboard and Monitoring
- Navigating the Ceph dashboard
- Key metrics: IOPS, throughput, latency
- Prometheus/Grafana integration
- Daily Maintenance
- Essential commands: ceph status, health, df
- OSD management: in/out, up/down
- Scrubbing and deep-scrub processes
Hands-on lab: Monitoring configuration, creating custom dashboards
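A sketch of the day-to-day commands behind this module; the pool names, PG counts, and OSD ID are examples:

```bash
# Replicated vs erasure-coded pools
ceph osd pool create rep-pool 64 64 replicated
ceph osd pool create ec-pool 64 64 erasure

# Daily checks: overall state, detailed health, capacity per pool
ceph status
ceph health detail
ceph df

# Drain an OSD before maintenance, then reintegrate it
ceph osd out 3
ceph osd in 3
```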
Module 8: Performance and Troubleshooting (2h)
- Performance analysis with built-in benchmarks (rados bench)
- Optimization: cache tiering, optimal placement
- Troubleshooting: slow requests, PG states
- Recovery and backfilling: management and priorities
- Practical failure scenarios and resolutions
- Backup and disaster recovery strategies
- Future developments: NVMe-oF, compression
Hands-on lab: Failure simulation and recovery, production cluster optimization
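A benchmarking and troubleshooting sketch in the spirit of this module, using rados bench; the pool name and durations are examples:

```bash
# 10-second write benchmark, keeping the objects so a read benchmark can follow
rados bench -p rep-pool 10 write --no-cleanup
rados bench -p rep-pool 10 rand
rados cleanup -p rep-pool

# Hunt for slow requests and placement groups that are not active+clean
ceph health detail
ceph pg stat
ceph pg dump_stuck
```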
Certification and Assessment
- Knowledge validation quiz at the end of training
- ECINTELLIGENCE training certificate
- Complete course materials in English (250+ pages)
- Lab access for 30 days after training
- Optional preparation for Red Hat Ceph certification
- Integration into the French-speaking Africa Ceph community
Certification
At the end of this training, you will receive a certificate of participation issued by ECINTELLIGENCE.
Ready to develop your skills?
Join hundreds of professionals who have trusted ECINTELLIGENCE with their skills development.