This note is really about preparation work. Before Ceph becomes interesting, there is a less glamorous part where you make sure the nodes, disks, network, and Kubernetes base layer are all predictable.
That is what this draft preserves.
1. Prepare the Target Nodes Carefully
The original note started with node-level prep:
- add the deployment user to sudoers
- verify which disk is the root disk
- wipe only the disks that will be used for storage
- clean /etc/fstab so only the intended mounts remain
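The sudoers and fstab steps in that list can be sketched roughly as follows. This is not the original command listing: the helper names, the "deploy" user, and the choice of passwordless sudo (common for Ansible-driven installs) are all assumptions.

```shell
#!/bin/sh
# Sketch only: helper names and the "deploy" user are hypothetical.

# Print a sudoers drop-in line granting passwordless sudo (an assumption;
# tighten it if your deployment tooling can prompt for a password).
sudoers_line() {
    printf '%s ALL=(ALL) NOPASSWD:ALL\n' "$1"
}

# List active fstab entries (comments and blank lines skipped) as
# "source mountpoint fstype" so unexpected mounts are easy to spot.
fstab_entries() {
    awk '!/^[ \t]*(#|$)/ { print $1, $2, $3 }' "${1:-/etc/fstab}"
}

# Typical use on a node, run by hand:
#   sudoers_line deploy | sudo tee /etc/sudoers.d/deploy
#   sudo chmod 0440 /etc/sudoers.d/deploy
#   sudo visudo -cf /etc/sudoers.d/deploy   # syntax-check the drop-in
#   fstab_entries                           # review before editing /etc/fstab
```

Printing the fstab entries first, instead of editing in place, keeps the cleanup a deliberate manual step.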
The disk wipe section was intentionally aggressive because it was meant for clean storage nodes:
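The original commands are not reproduced here. As a minimal sketch of what an aggressive wipe can look like, the helper below prints a plan instead of running it and refuses outright if the root disk appears among the targets. The function name and the device names are hypothetical; wipefs, sgdisk, and dd are a common combination for clearing old signatures and partition tables, but they are my assumption, not necessarily what the original note used.

```shell
#!/bin/sh
# Sketch only: plan_wipe is a hypothetical helper; device names are placeholders.

# Print the wipe commands for each target disk; refuse the root disk outright.
# Usage: plan_wipe ROOT_DISK TARGET...
plan_wipe() {
    local root_disk="$1" disk
    shift
    # Check every target before printing anything, so a bad list yields no plan.
    for disk in "$@"; do
        if [ "$disk" = "$root_disk" ]; then
            echo "refusing to wipe root disk $disk" >&2
            return 1
        fi
    done
    for disk in "$@"; do
        echo "wipefs --all $disk"       # clear filesystem/RAID/LVM signatures
        echo "sgdisk --zap-all $disk"   # destroy GPT and MBR structures
        echo "dd if=/dev/zero of=$disk bs=1M count=100 oflag=direct"
    done
}

# Find the root disk on the node first, for example:
#   findmnt -n -o SOURCE /        # root filesystem source, e.g. /dev/sda2
#   lsblk -no PKNAME /dev/sda2    # its parent disk, e.g. sda
# Review the printed plan, then run it deliberately:
#   plan_wipe /dev/sda /dev/sdb /dev/sdc | sudo sh -x
```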
That is not a command sequence to copy casually. It is only appropriate when you are certain the disks are meant to become Ceph OSD media.
2. Make the Node Networking Explicit
The source note also captured the network work that tends to get skipped in polished tutorials:
- configure the expected interfaces in netplan
- apply the network config
- verify that the storage or high-speed link is actually up
That matters because storage problems often get blamed on Ceph when the real issue is the network underneath it.
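A small check like the one below makes the "is the link actually up" step explicit. It is a sketch under assumptions: the function names and the interface name are hypothetical, and it reads the kernel's operstate file from sysfs rather than whatever the original note used.

```shell
#!/bin/sh
# Sketch only: helper names and interface names are hypothetical.

# Print the kernel's operstate ("up", "down", ...) for an interface.
# SYSFS_NET can be overridden so the check can be exercised without hardware.
link_state() {
    cat "${SYSFS_NET:-/sys/class/net}/$1/operstate"
}

# Fail loudly unless the interface exists and reports "up".
require_link_up() {
    local state
    state=$(link_state "$1" 2>/dev/null) || state=missing
    if [ "$state" != "up" ]; then
        echo "$1: link is $state" >&2
        return 1
    fi
    echo "$1: link is up"
}

# Typical flow on a node, run by hand:
#   sudo netplan try            # apply with automatic rollback if not confirmed
#   sudo netplan apply
#   ip -br addr                 # confirm addresses landed on the right interfaces
#   require_link_up ens1f0      # hypothetical storage interface name
```

Some virtual interfaces report "unknown" instead of "up", so this check suits physical storage links best.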
3. Use DeepOps to Build the Kubernetes Base Layer
Once the nodes were reachable, the next phase was a DeepOps-based Kubernetes deployment:
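As a rough sketch of the usual DeepOps bootstrap, assuming the public NVIDIA repository and its setup script (the original note may have pinned a specific release tag, which is not shown here). The run and bootstrap_deepops helpers are hypothetical; by default they only print the commands.

```shell
#!/bin/sh
# Sketch only: run and bootstrap_deepops are hypothetical helpers.

# Echo commands by default; set DRY_RUN=0 to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Clone DeepOps on the provisioning host and run its setup script, which
# installs Ansible and creates a config/ directory from the bundled examples.
bootstrap_deepops() {
    run git clone https://github.com/NVIDIA/deepops.git
    run cd deepops
    run ./scripts/setup.sh
}

# Preview the steps:   bootstrap_deepops
# Execute them:        DRY_RUN=0 bootstrap_deepops
```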
Then prepare the inventory and the k8s-cluster.yml settings so the control-plane addresses, Calico mode, and non-GPU behavior are correct for the environment.
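To make the inventory step concrete, here is a hedged sketch that prints a minimal inventory with one control-plane node and any number of workers. The helper name is hypothetical, the addresses are documentation placeholders, and the group names (kube-master, etcd, kube-node, k8s-cluster) are the ones I know from older DeepOps example inventories; newer releases may name them differently, so compare against the example inventory shipped with your checkout. The k8s-cluster.yml variables are left out on purpose because their names depend on the release.

```shell
#!/bin/sh
# Sketch only: deepops_inventory is a hypothetical helper, and the group
# names are assumptions based on older DeepOps example inventories.

# Arguments are "name=address" pairs; the first is the control plane,
# the rest become workers (so pass at least two hosts).
deepops_inventory() {
    local first="${1%%=*}" pair
    echo "[all]"
    for pair in "$@"; do
        echo "${pair%%=*} ansible_host=${pair#*=}"
    done
    printf '\n[kube-master]\n%s\n\n[etcd]\n%s\n\n[kube-node]\n' "$first" "$first"
    shift
    for pair in "$@"; do
        echo "${pair%%=*}"
    done
    printf '\n[k8s-cluster:children]\nkube-master\nkube-node\n'
}

# Example (placeholder names and addresses):
#   deepops_inventory cp01=192.0.2.11 worker01=192.0.2.21 > config/inventory
```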
The practical deployment command was:
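The exact invocation from the original note is not preserved. A minimal sketch, assuming the standard DeepOps playbook path and the default inventory location, wrapped in a hypothetical helper that checks reachability before starting a long run:

```shell
#!/bin/sh
# Sketch only: deploy_k8s is a hypothetical wrapper; paths are assumed
# DeepOps defaults and are relative to the DeepOps checkout.

deploy_k8s() {
    local inventory="${1:-config/inventory}"
    if [ ! -f "$inventory" ]; then
        echo "inventory not found: $inventory" >&2
        return 1
    fi
    # Confirm every host answers over SSH before starting the playbook.
    ansible all -i "$inventory" -m raw -a "hostname" || return 1
    # Limit the run to the Kubernetes group and apply the cluster playbook.
    ansible-playbook -i "$inventory" -l k8s-cluster playbooks/k8s-cluster.yml
}

# Usage, from the DeepOps checkout:
#   deploy_k8s
```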
4. Add MetalLB for Bare-Metal Service Exposure
Once the base cluster was alive, the next useful layer was MetalLB:
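One common way to install MetalLB is to apply the upstream manifest directly, sketched below. The version number is an assumption chosen for illustration, and the helper names are hypothetical. DeepOps also ships its own load balancer script, which the original note may have used instead.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical and the version is an assumption;
# pin whichever MetalLB release you have validated.
METALLB_VERSION="${METALLB_VERSION:-v0.13.12}"

# Build the URL of the upstream "native" manifest for a given release tag.
metallb_manifest_url() {
    printf 'https://raw.githubusercontent.com/metallb/metallb/%s/config/manifests/metallb-native.yaml\n' "$1"
}

install_metallb() {
    kubectl apply -f "$(metallb_manifest_url "$METALLB_VERSION")"
    # Wait for the controller and speaker pods before creating address pools.
    kubectl wait --namespace metallb-system --for=condition=ready pod \
        --selector=app=metallb --timeout=120s
}

# Usage:
#   install_metallb
```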
Then define an address pool and L2 advertisement:
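A minimal sketch of that pair of resources, assuming the metallb.io/v1beta1 custom resources used by MetalLB 0.13 and later. The helper name, the pool name, and the address range are hypothetical; the range comes from the 192.0.2.0/24 documentation block.

```shell
#!/bin/sh
# Sketch only: metallb_l2_config is a hypothetical helper that prints the
# manifests so they can be reviewed before being applied.

# Usage: metallb_l2_config POOL_NAME ADDRESS_RANGE
metallb_l2_config() {
    local pool_name="$1" addresses="$2"
    cat <<EOF
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: ${pool_name}
  namespace: metallb-system
spec:
  addresses:
    - ${addresses}
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: ${pool_name}-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - ${pool_name}
EOF
}

# Review, then apply:
#   metallb_l2_config lab-pool 192.0.2.240-192.0.2.250
#   metallb_l2_config lab-pool 192.0.2.240-192.0.2.250 | kubectl apply -f -
```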
I have replaced the real network values with documentation-safe placeholders here.
5. Add Monitoring Early
One thing I like about the original note is that it did not stop at “the cluster deployed successfully.” It moved on to monitoring immediately.
That post-deployment flow was:
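The exact steps are not preserved here. As a sketch, assuming the monitoring script that ships with DeepOps and a namespace called monitoring (both are assumptions about this environment), a hypothetical wrapper could look like this:

```shell
#!/bin/sh
# Sketch only: deploy_monitoring is a hypothetical wrapper; the script path
# and the "monitoring" namespace are assumed DeepOps defaults.

deploy_monitoring() {
    local script="${1:-./scripts/k8s/deploy_monitoring.sh}"
    if [ ! -x "$script" ]; then
        echo "not found or not executable: $script" >&2
        return 1
    fi
    # Install the monitoring stack, then list its pods to confirm they start.
    "$script" || return 1
    kubectl get pods --namespace monitoring
}

# Usage, from the DeepOps checkout once the cluster reports Ready nodes:
#   kubectl get nodes
#   deploy_monitoring
```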
That is a good habit on new clusters. If you wait too long to add observability, the first serious debugging session becomes harder than it needs to be.
Closing Thought
There are a lot of guides that start at the point where the storage software is already about to be installed. In real life, that is not where the work starts.
It starts with clean disks, predictable networking, reachable nodes, a usable inventory, and a cluster that is boring enough to trust.