Adding Monitoring to a Kubernetes Cluster with kube-prometheus

There is a difference between installing monitoring and actually making it useful for an operations team.

This note leans toward the second half: not just getting Prometheus into the cluster, but wiring up alerting and service exposure so the stack is usable from outside the cluster too.

1. Install the Base Stack

The initial deployment was the usual kube-prometheus flow:

git clone https://github.com/prometheus-operator/kube-prometheus.git
cd kube-prometheus

kubectl apply --server-side -f manifests/setup
kubectl wait --for condition=Established --all CustomResourceDefinition --namespace=monitoring
kubectl apply -f manifests/

Then the monitoring services were exposed:

kubectl patch svc -n monitoring prometheus-k8s -p '{"spec":{"type":"LoadBalancer"}}'
kubectl patch svc -n monitoring grafana -p '{"spec":{"type":"LoadBalancer"}}'
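
After patching, it can help to check whether each service actually received an external address. The following is a minimal sketch, assuming the load balancer reports its address under status.loadBalancer.ingress; the helper name lb_address is hypothetical and not part of kube-prometheus.

```shell
# Hypothetical helper: print the external IP or hostname of a service in the
# monitoring namespace. Prints "pending" when no address is reported yet
# (note: it also prints "pending" if the kubectl lookup itself fails).
lb_address() {
  local svc="$1" addr
  addr="$(kubectl get svc "$svc" --namespace monitoring \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}{.status.loadBalancer.ingress[0].hostname}' \
    2>/dev/null || true)"
  echo "${addr:-pending}"
}
```

Calling lb_address prometheus-k8s and lb_address grafana shows whether either service is still waiting for an address.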

2. Add Alertmanager Configuration

The note also included the operational step that often gets delayed: actually wiring alert delivery.

The generalized shape looked like this:

apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: alerts-config
  namespace: monitoring
spec:
  receivers:
    - name: slack
      slackConfigs:
        - apiURL:
            key: webhook
            name: alerts-secret
          channel: "#alerts"
          sendResolved: true
  route:
    receiver: slack

The original note contained a live webhook secret and internal channel details, so those are intentionally replaced here.
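
The manifest above reads the webhook URL from a Secret named alerts-secret under the key webhook, so that Secret has to exist in the monitoring namespace. A minimal sketch of creating it, assuming the webhook URL is passed in as an argument; the function name create_alerts_secret is hypothetical.

```shell
# Hypothetical helper: create the Secret referenced by the AlertmanagerConfig.
# The Secret name (alerts-secret) and key (webhook) match the manifest above;
# the webhook URL itself is supplied by the caller.
create_alerts_secret() {
  local url="$1"
  kubectl create secret generic alerts-secret \
    --namespace monitoring \
    --from-literal=webhook="$url"
}
```

A call would look like create_alerts_secret "$SLACK_WEBHOOK_URL", where SLACK_WEBHOOK_URL is an assumed environment variable. If no alerts arrive afterwards, one thing worth checking is whether the Alertmanager resource's alertmanagerConfigSelector matches this AlertmanagerConfig.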

3. Extend the Prometheus RBAC

Like the other monitoring note in this batch, this one needed Prometheus to have more permissions than the default cluster role granted.

The useful part to preserve is the pattern:

check what Prometheus is trying to scrape
compare that to the current role
add the missing get, list, and sometimes watch verbs for the relevant resources
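
The comparison step can be sketched with kubectl auth can-i, which asks the API server whether a given identity may perform a verb. This assumes Prometheus runs as the prometheus-k8s service account in the monitoring namespace (the kube-prometheus default) and that the caller is allowed to impersonate it; the function names are hypothetical.

```shell
# Ask the API server whether the Prometheus service account may perform a verb.
# Assumption: the service account is prometheus-k8s in the monitoring namespace.
prom_can_i() {
  local verb="$1" resource="$2" namespace="$3"
  kubectl auth can-i "$verb" "$resource" \
    --namespace "$namespace" \
    --as "system:serviceaccount:monitoring:prometheus-k8s"
}

# Print each of get, list, watch that is NOT currently allowed for one
# resource in one namespace. Any answer other than "yes" counts as missing.
missing_verbs() {
  local resource="$1" namespace="$2" verb answer
  for verb in get list watch; do
    answer="$(prom_can_i "$verb" "$resource" "$namespace" 2>/dev/null || true)"
    if [ "$answer" != "yes" ]; then
      echo "$verb"
    fi
  done
}
```

For example, missing_verbs endpoints default lists the verbs that would need to be added to a role before Prometheus can discover endpoints in the default namespace.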

That is not glamorous work, but it is the sort of thing you end up doing in real clusters.

4. Treat Monitoring as a Real Service

The practical lesson from this note is that monitoring is not “done” when the pods are Running.

It is done when:

Prometheus can actually see the resources you care about
Grafana is reachable where operators need it
Alertmanager is wired to something a human will actually see

That is a different definition of success, and it is usually the more useful one.
