
# Observability

The operator ships with built-in support for metrics, alerting, distributed tracing, and structured logging. These features constitute the operator's Level IV (Deep Insights) capabilities.

## Metrics

Metrics are served via the standard controller-runtime Prometheus endpoint on `:8443` over HTTPS by default, with authentication and authorization enabled via `--metrics-secure`. To expose an insecure HTTP endpoint for local development, set `--metrics-bind-address=:8080 --metrics-secure=false`.
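For local development, the equivalent manager container args (using only the flags above) look like:

```yaml
# Local development only: plain-HTTP metrics with auth disabled.
args:
  - --metrics-bind-address=:8080
  - --metrics-secure=false
```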

The operator exposes two classes of metrics:

**Framework Metrics** (provided automatically by controller-runtime):

- `controller_runtime_reconcile_total` — total reconcile count per controller
- `controller_runtime_reconcile_errors_total` — total reconcile errors per controller (use `rate()` to derive an error rate)
- `controller_runtime_reconcile_time_seconds` — reconcile latency histogram
- `workqueue_depth` — work queue backlog
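A recording rule can turn the two counters above into a per-controller error ratio; the group and rule names below are illustrative, not part of the shipped config:

```yaml
groups:
  - name: multigres-operator.records   # illustrative group name
    rules:
      - record: controller:reconcile_error_ratio:rate5m
        expr: |
          sum by (controller) (rate(controller_runtime_reconcile_errors_total[5m]))
            /
          sum by (controller) (rate(controller_runtime_reconcile_total[5m]))
```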

**Operator-Specific Metrics:**

| Metric | Type | Labels | Description |
|---|---|---|---|
| `multigres_operator_cluster_info` | Gauge | `name`, `namespace`, `phase` | Cluster phase tracking (always 1) |
| `multigres_operator_cluster_cells_total` | Gauge | `cluster`, `namespace` | Cell count |
| `multigres_operator_cluster_shards_total` | Gauge | `cluster`, `namespace` | Shard count |
| `multigres_operator_cell_gateway_replicas` | Gauge | `cell`, `namespace`, `state` | Gateway ready/desired replicas |
| `multigres_operator_shard_pool_replicas` | Gauge | `cluster`, `shard`, `pool`, `cell`, `namespace`, `state` | Pool ready/desired replicas |
| `multigres_operator_pool_pods_drifted` | Gauge | `cluster`, `shard`, `pool`, `cell`, `namespace` | Pods with spec-hash mismatch requiring rolling update |
| `multigres_operator_toposerver_replicas` | Gauge | `name`, `namespace`, `state` | TopoServer ready/desired replicas |
| `multigres_operator_webhook_request_total` | Counter | `operation`, `resource`, `result` | Webhook admission request count |
| `multigres_operator_webhook_request_duration_seconds` | Histogram | `operation`, `resource` | Webhook latency |
| `multigres_operator_last_backup_age_seconds` | Gauge | `cluster`, `shard`, `namespace` | Age of most recent completed backup (seconds) |
| `multigres_operator_drain_operations_total` | Counter | `cluster`, `shard`, `result` | Total graceful pod drain operations |
| `multigres_operator_rolling_update_in_progress` | Gauge | `cluster`, `shard`, `pool`, `cell`, `namespace` | Whether a rolling update is in progress for a pool |

## Alerts

Pre-configured PrometheusRule alerts are provided in `config/monitoring/prometheus-rules.yaml`. Apply them to a Prometheus Operator installation:

```sh
kubectl apply -f config/monitoring/prometheus-rules.yaml
```

| Alert | Severity | Fires When |
|---|---|---|
| `MultigresClusterReconcileErrors` | warning | Sustained non-zero reconcile error rate (5m) |
| `MultigresClusterDegraded` | warning | Cluster phase ≠ "Healthy" for >10m |
| `MultigresCellGatewayUnavailable` | critical | Zero ready gateway replicas in a cell (5m) |
| `MultigresShardPoolDegraded` | warning | Ready < desired replicas for >10m |
| `MultigresWebhookErrors` | warning | Webhook returning errors (5m) |
| `MultigresReconcileSlow` | warning | p99 reconcile latency >30s (5m) |
| `MultigresControllerSaturated` | warning | Work queue depth >50 for >10m |
| `MultigresBackupStale` | warning | Most recent backup older than 24 hours (30m) |
| `MultigresRollingUpdateStuck` | warning | Rolling update in progress for >30m |
| `MultigresDrainTimeout` | warning | Drain operations timing out (10m) |
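As a sketch of what one of these rules looks like (the shipped expressions live in `config/monitoring/prometheus-rules.yaml` and may differ), the gateway alert can be expressed against `multigres_operator_cell_gateway_replicas`:

```yaml
# Illustrative rule only; see config/monitoring/prometheus-rules.yaml
# for the shipped definitions.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: multigres-operator-alerts
spec:
  groups:
    - name: multigres.cells
      rules:
        - alert: MultigresCellGatewayUnavailable
          expr: multigres_operator_cell_gateway_replicas{state="ready"} == 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Cell {{ $labels.cell }} has no ready gateway replicas"
```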

Each alert links to a dedicated runbook with investigation steps, PromQL queries, and remediation actions.

## Grafana Dashboards

Three Grafana dashboards are included in `config/monitoring/`:

- **Operator Dashboard** (`grafana-dashboard-operator.json`) — reconcile rates, error rates, latencies, queue depth, and webhook performance.
- **Cluster Dashboard** (`grafana-dashboard-cluster.json`) — per-cluster topology (cells, shards), replica health, and phase tracking.
- **Data Plane Dashboard** (`grafana-dashboard-data-plane.json`) — pool pod drift, rolling update progress, drain operations, and backup age.

Import via the Grafana dashboards ConfigMap (generated by `config/monitoring/kustomization.yaml`):

```sh
kubectl apply -k config/monitoring/
```

## Local Development

For local development, the observability overlay in `config/deploy-observability/` deploys the OTel Collector, Prometheus (via the Prometheus Operator), Tempo, and Grafana as separate pods. Both dashboards and datasources are pre-provisioned.

```sh
make kind-deploy-observability
make kind-portforward
```

This deploys the operator with tracing enabled and opens port-forwards to:

| Service | URL |
|---|---|
| Grafana | http://localhost:3000 |
| Prometheus | http://localhost:9090 |
| Tempo | http://localhost:3200 |

**Metrics collection:** The operator and data-plane components use different metric collection models:

| Component | Metric Model | How it works |
|---|---|---|
| Operator | Pull (Prometheus scrape) | Prometheus scrapes the operator's `/metrics` endpoint via controller-runtime's built-in Prometheus integration |
| Data plane runtimes (`multiorch`, `multipooler`, `multigateway`, etc.) | Push (OTLP) | Multigres binaries push metrics via OpenTelemetry to the configured OTLP endpoint |
| Postgres engine metrics (`postgres_exporter` sidecar on shard pool pods) | Pull (Prometheus scrape) | Prometheus scrapes the metrics port on shard-pool headless Services via a ServiceMonitor |

The OTel Collector receives all pushed OTLP signals from the data plane and routes them: traces → Tempo, metrics → Prometheus (via its OTLP receiver). This is necessary because multigres components send all signals to a single OTLP endpoint and cannot split them by signal type.
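A minimal collector configuration implementing that routing might look like the following sketch; service names and ports are assumptions for the kind-based overlay, not the shipped config:

```yaml
receivers:
  otlp:
    protocols:
      grpc: {}   # data-plane components push OTLP here
      http: {}
processors:
  batch: {}
exporters:
  otlp/tempo:
    endpoint: tempo.monitoring.svc:4317          # assumed Tempo address
    tls:
      insecure: true
  otlphttp/prometheus:
    endpoint: http://prometheus.monitoring.svc:9090/api/v1/otlp  # Prometheus OTLP receiver
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/tempo]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/prometheus]
```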

## Distributed Tracing

The operator supports OpenTelemetry distributed tracing via OTLP. Tracing is disabled by default and incurs zero overhead when off.

**Enabling tracing:** Set a single environment variable on the operator Deployment:

```yaml
env:
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: "http://otel-collector.monitoring.svc:4318"  # OTel Collector or Tempo
```

The endpoint must speak OTLP (HTTP or gRPC) — this can be an OpenTelemetry Collector, Grafana Tempo, Jaeger, or any compatible backend.

**What gets traced:**

- Every controller reconciliation (MultigresCluster, Cell, Shard, TableGroup, TopoServer)
- Sub-operations within a reconcile (`ReconcileCells`, `UpdateStatus`, `PopulateDefaults`, etc.)
- Webhook admission handling (defaulting and validation)
- Webhook-to-reconcile trace propagation: the defaulter webhook injects trace context into cluster annotations so the first reconciliation appears as a child span of the webhook trace

**Additional OTel configuration:** The operator respects all standard OTel environment variables, including `OTEL_TRACES_SAMPLER`, `OTEL_EXPORTER_OTLP_INSECURE`, and `OTEL_SERVICE_NAME`.
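For example, a sketch combining these standard variables to sample 10% of root traces (values are illustrative):

```yaml
env:
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: "http://otel-collector.monitoring.svc:4318"
  - name: OTEL_SERVICE_NAME
    value: "multigres-operator"        # illustrative service name
  - name: OTEL_TRACES_SAMPLER
    value: "parentbased_traceidratio"  # standard OTel sampler name
  - name: OTEL_TRACES_SAMPLER_ARG
    value: "0.1"                       # keep 10% of root traces
```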

**Custom sampler configuration:** When `spec.observability.samplingConfigRef` references a ConfigMap, the operator propagates `OTEL_TRACES_SAMPLER_CONFIG` to all data-plane containers (pool pods, `multigateway`, `multiadmin`, `multiorch`) and mounts the ConfigMap as a volume. This is required when using custom samplers like `multigres_custom` that read configuration from a file. Without `samplingConfigRef`, the sampler defaults to the standard OTel environment variable behavior.
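In the cluster spec this looks roughly like the fragment below; the nested `name` key is an assumed shape for the ConfigMap reference:

```yaml
spec:
  observability:
    samplingConfigRef:
      name: my-sampler-config   # ConfigMap holding the sampler's file-based config
```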

## Structured Logging

The operator uses structured JSON logging (zap via controller-runtime). When tracing is enabled, every log line within a traced operation automatically includes `trace_id` and `span_id` fields, enabling log-trace correlation — click a log line in Grafana Loki to jump directly to the associated trace.
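A traced log line therefore looks roughly like this (field layout is illustrative apart from the `trace_id` and `span_id` fields):

```json
{"level":"info","ts":"2025-01-01T12:00:00Z","logger":"controllers.MultigresCluster","msg":"reconcile complete","trace_id":"4bf92f3577b34da6a3ce929d0e0e4736","span_id":"00f067aa0ba902b7"}
```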

**Log level configuration:** The operator accepts standard controller-runtime zap flags on its command line:

| Flag | Default | Description |
|---|---|---|
| `--zap-devel` | `true` | Development mode preset (see table below) |
| `--zap-log-level` | depends on mode | Log verbosity: `debug`, `info`, `error`, or an integer (0=debug, 1=info, 2=error) |
| `--zap-encoder` | depends on mode | Log format: `console` or `json` |
| `--zap-stacktrace-level` | depends on mode | Minimum level that triggers stacktraces |

`--zap-devel` is a mode flag that sets multiple defaults at once; `--zap-log-level` overrides the mode's default level when specified explicitly:

| Setting | `--zap-devel=true` (default) | `--zap-devel=false` (production) |
|---|---|---|
| Default log level | `debug` | `info` |
| Encoder | `console` (human-readable) | `json` |
| Stacktraces from | `warn` | `error` |

To change the log level in a deployed operator, add args to the manager container:

```yaml
spec:
  template:
    spec:
      containers:
        - name: manager
          args:
            - --zap-devel=false       # Production mode (JSON, info level default)
            - --zap-log-level=info    # Explicit level (overrides mode default)
```

> **Note**
> The default build ships with `Development: true`, which sets the default level to `debug` and uses the human-readable console encoder. For production deployments, set `--zap-devel=false` to switch to JSON encoding and info-level logging.

For a hands-on tutorial of the full observability stack, see the Observability Demo.