Monitoring with Grafana and Prometheus — How to Monitor Your Server in Real Time
Imagine this scenario: a client calls you on Monday morning and says their website hasn’t been working since Saturday at midnight. You have no idea when the problem occurred, which service went down, or what the cause was. An hour of debugging later, you discover the disk was full — a problem that’s easy to fix, but only when you know it exists.
This happens daily at companies without proper monitoring in place. Grafana and Prometheus, a de facto standard for infrastructure monitoring, make such situations far less likely.
What Is Monitoring and Why Is It Critical?
Monitoring is the continuous collection, storage, and analysis of system metrics — CPU, RAM, disk, network, application response times, error counts, and hundreds of other parameters.
Without monitoring you work reactively — waiting for a problem to escalate to the point where a client or user reports it. With monitoring you work proactively — the system warns you before the problem becomes visible to users.
Concrete benefits of monitoring:
Detecting problems before users do. An alert wakes you at 3 AM when CPU exceeds 90%, instead of a client calling at 9 AM because the application won't open.
Capacity planning. You see the trend of disk space growth and know that in 3 months you need to add storage — you don’t wait for the system to crash from a full disk.
Performance optimization. You identify which database queries consume the most resources and optimize them.
Debugging. When a problem occurs, you have historical data showing exactly what was happening in the system at the moment of the incident.
SLA tracking. You prove to clients the system uptime with precise data, not estimates.
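Two of these benefits map directly onto PromQL expressions. The sketch below assumes Node Exporter metrics and the built-in up series are being scraped. The 6-hour lookback, the 7-day horizon, and the 30-day window are illustrative choices.

```promql
# Capacity planning: extrapolate the last 6 hours of free-space samples on "/"
# linearly, 7 days ahead. A result below 0 suggests the disk will be full by then.
predict_linear(node_filesystem_avail_bytes{mountpoint="/"}[6h], 7 * 24 * 3600) < 0

# SLA tracking: percentage of scrapes in the last 30 days in which the target was reachable.
avg_over_time(up{job="node"}[30d]) * 100
```

The second expression can only look back as far as Prometheus retains data, so the window should not exceed the configured retention time.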
Grafana and Prometheus — Who Does What?
Prometheus and Grafana are almost always mentioned together, but they are two separate tools with different roles.
Prometheus — Metrics Collection and Storage
Prometheus is an open-source monitoring system originally developed at SoundCloud in 2012. It is now a graduated project of the Cloud Native Computing Foundation (CNCF), the foundation that also hosts Kubernetes.
Prometheus works on a pull model: at a defined interval (typically every 15-60 seconds) it scrapes metrics over HTTP from each target's metrics endpoint, usually /metrics. The samples are stored in its built-in time-series database.
Each metric in Prometheus has:
- Name (e.g., node_cpu_seconds_total)
- Labels: key-value pairs that enable filtering (e.g., {instance="server1", job="node"})
- Value: numeric value at a given moment
- Timestamp: time of sampling
For searching and aggregating metrics, Prometheus uses its own query language PromQL (Prometheus Query Language).
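As a rough illustration of how names and labels appear in PromQL, the queries below reuse the example metric and labels from the list above. The mode label is one that Node Exporter attaches to this metric, and the 5-minute window is an arbitrary choice.

```promql
# Instant selector: the latest sample of every series with this name and these labels.
node_cpu_seconds_total{instance="server1", job="node"}

# Range selector wrapped in rate(): per-second increase of the counter, averaged over 5 minutes.
rate(node_cpu_seconds_total{instance="server1", mode="idle"}[5m])

# Aggregation: sum the per-CPU, per-mode rates into one value per instance.
sum by (instance) (rate(node_cpu_seconds_total[5m]))
```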
Grafana — Data Visualization
Grafana is an open-source data visualization platform. It doesn’t collect metrics itself — it connects to data sources (Prometheus, InfluxDB, Elasticsearch, SQL databases, and many others) and displays them as charts, tables, heatmaps, and other visualizations.
Grafana dashboards are a powerful tool — on a single screen you can see everything that matters: status of all services, system load, application response times, error count in the last hour.
Monitoring Stack Architecture
A typical Prometheus + Grafana setup looks like this:
[Servers/Applications] → [Exporters] → [Prometheus] → [Grafana]
                                            ↓
                                     [Alertmanager] → [Email/Slack/PagerDuty]
Exporters are small programs that “translate” system metrics into a format Prometheus understands. The most important exporters:
- Node Exporter — system metrics (CPU, RAM, disk, network) for Linux servers
- MySQL Exporter — MySQL/MariaDB database metrics
- Nginx Exporter — Nginx web server metrics
- Blackbox Exporter — URL availability checks (HTTP, HTTPS, DNS, TCP)
- cAdvisor — Docker container metrics
Alertmanager manages alerts — deduplicates, groups, and routes notifications to email, Slack, PagerDuty, and other channels.
Installation on a Hetzner Server
We’ll show you how to set up a complete monitoring stack on a Hetzner VPS using Docker Compose — the fastest and easiest method.
Prerequisites
- Hetzner VPS with Ubuntu 22.04 (CX22 or larger)
- Docker and Docker Compose installed
- Open ports: 3000 (Grafana), 9090 (Prometheus)
File Structure
/opt/monitoring/
├── docker-compose.yml
├── prometheus/
│   ├── prometheus.yml
│   └── alerts.yml
├── grafana/
│   └── datasources/
│       └── prometheus.yml
└── alertmanager/
    └── alertmanager.yml
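The tree lists grafana/datasources/prometheus.yml, but the article does not show its contents. A minimal sketch of a Grafana data source provisioning file follows, assuming the Prometheus container is reachable under the service name prometheus on port 9090, as in the Compose file in the next section. The display name "Prometheus" is an arbitrary choice.

```yaml
# grafana/datasources/prometheus.yml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                 # Grafana's backend makes the requests, not the browser
    url: http://prometheus:9090   # Compose service name and port
    isDefault: true
```

Grafana reads this file at startup from /etc/grafana/provisioning/datasources, so the data source should already exist on first login.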
Docker Compose Configuration
version: '3.8'

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.retention.time=30d'
      # Enables the HTTP reload endpoint (POST /-/reload) used later in this article
      - '--web.enable-lifecycle'
    ports:
      - "9090:9090"

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: unless-stopped
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/datasources:/etc/grafana/provisioning/datasources
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=your_password
      - GF_USERS_ALLOW_SIGN_UP=false
    ports:
      - "3000:3000"
    depends_on:
      - prometheus

  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      # Report host mount points (so mountpoint="/" is the host's root filesystem)
      - '--path.rootfs=/rootfs'
      # "$$" is a literal "$"; a single "$" would be treated as Compose variable interpolation.
      # Older Node Exporter releases call this flag --collector.filesystem.ignored-mount-points.
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    expose:
      - 9100

  alertmanager:
    image: prom/alertmanager:latest
    container_name: alertmanager
    restart: unless-stopped
    volumes:
      - ./alertmanager:/etc/alertmanager
    ports:
      - "9093:9093"

volumes:
  prometheus_data:
  grafana_data:
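The Compose file mounts ./alertmanager over /etc/alertmanager, and the Alertmanager image looks for /etc/alertmanager/alertmanager.yml by default, so the container needs that file to start. The article does not show it. The following is a minimal sketch, assuming email delivery. The receiver name, addresses, SMTP host, and credentials are placeholders, and the timing values are illustrative.

```yaml
# alertmanager/alertmanager.yml
route:
  receiver: team-email
  group_by: ['alertname', 'instance']   # alerts sharing these labels arrive as one notification
  group_wait: 30s        # wait before sending the first notification for a new group
  group_interval: 5m     # wait before notifying about new alerts added to an existing group
  repeat_interval: 4h    # re-send while an alert keeps firing

receivers:
  - name: team-email
    email_configs:
      - to: 'ops@example.com'
        from: 'alertmanager@example.com'
        smarthost: 'smtp.example.com:587'
        auth_username: 'alertmanager@example.com'
        auth_password: 'change-me'
```

Slack or PagerDuty would use slack_configs or pagerduty_configs under the receiver instead of email_configs.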
Prometheus Configuration
# prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node'
    static_configs:
      - targets: ['node-exporter:9100']
        labels:
          instance: 'hetzner-main'

  # Add other servers
  - job_name: 'node-server2'
    static_configs:
      - targets: ['192.168.1.2:9100']
        labels:
          instance: 'server2'
Starting the Stack
cd /opt/monitoring
docker compose up -d
(With the legacy standalone Compose v1 binary, the command is docker-compose up -d.)
Grafana is available at http://your-ip:3000 (admin / your_password).
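Before logging in, it can help to confirm that each service answers on its health endpoint. The helper below is a sketch, not part of the original setup: check_stack is a hypothetical function name, and the ports are the ones published in the Compose file above.

```shell
#!/bin/sh
# Hypothetical helper: report whether each service answers its health endpoint.
# Ports match the Compose file above; pass a host name or IP as the first argument.
check_stack() {
  host="${1:-localhost}"
  for url in \
    "http://$host:9090/-/healthy" \
    "http://$host:3000/api/health" \
    "http://$host:9093/-/healthy"
  do
    # -f makes curl fail on HTTP errors; --max-time keeps a dead host from hanging the check
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      echo "OK   $url"
    else
      echo "FAIL $url"
    fi
  done
}
```

Running check_stack your-ip prints one line per service (Prometheus, Grafana, Alertmanager). A FAIL line is a cue to inspect that container with docker logs.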
Setting Up Grafana Dashboards
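After your first login to Grafana:
1. Add Prometheus as a data source (skip this step if it is already provisioned from grafana/datasources/prometheus.yml): Connections → Data sources → Add data source → Prometheus → URL: http://prometheus:9090. In older Grafana versions the menu path is Configuration → Data Sources.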
2. Import ready-made dashboards: Grafana has a rich library of ready-made dashboards at grafana.com/grafana/dashboards.
Most useful dashboards:
- Node Exporter Full (ID: 1860) — complete Linux server overview
- Docker and System Monitoring (ID: 893) — Docker container monitoring
- Nginx (ID: 9614) — Nginx web server metrics
- MySQL Overview (ID: 7362) — MySQL database monitoring
Import: Dashboards → Import → enter ID → Load → Import
Useful PromQL Queries
PromQL is a powerful language enabling complex analyses. Here are the most useful queries:
CPU usage per server:
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
Available RAM (MB):
node_memory_MemAvailable_bytes / 1024 / 1024
Disk usage (%):
100 - ((node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100)
Inbound network traffic (MiB/s):
rate(node_network_receive_bytes_total{device="eth0"}[5m]) / 1024 / 1024
Server uptime (seconds):
node_time_seconds - node_boot_time_seconds
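The queries above hard-code one mount point and one network interface. The variants below are a sketch of how they might be generalized. The excluded filesystem types and the loopback filter are assumptions to adjust for your servers.

```promql
# Memory in use (%), based on the kernel's "available" estimate.
(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100

# Disk usage (%) for every filesystem except tmpfs and overlay mounts, not just "/".
100 - (node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}) * 100

# Inbound plus outbound traffic in bytes per second, per instance, ignoring loopback.
sum by (instance) (
    rate(node_network_receive_bytes_total{device!="lo"}[5m])
  + rate(node_network_transmit_bytes_total{device!="lo"}[5m])
)
```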
Setting Up Alerts
Alerts are perhaps the most important part of a monitoring system. Without alerts, the data you collect is only useful for retrospective analysis — not proactive response.
Example alert configuration:
# prometheus/alerts.yml
groups:
  - name: server_alerts
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU load on {{ $labels.instance }}"
          description: "CPU has been at {{ $value }}% for 5 minutes"

      - alert: DiskSpaceLow
        expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 < 10
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Low disk space on {{ $labels.instance }}"
          description: "Only {{ $value }}% free space remaining"

      - alert: ServerDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Server unreachable: {{ $labels.instance }}"
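Prometheus only evaluates these rules if the file is visible inside the container and listed in prometheus.yml, and it only forwards firing alerts if it knows where Alertmanager is. The fragment below sketches that wiring. It assumes the container path /etc/prometheus/alerts.yml and the Compose service name alertmanager on port 9093.

```yaml
# docker-compose.yml: extra entry under services.prometheus.volumes
#   - ./prometheus/alerts.yml:/etc/prometheus/alerts.yml

# prometheus/prometheus.yml: additional top-level sections
rule_files:
  - /etc/prometheus/alerts.yml

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']
```

The rules file can be validated before a restart with promtool, which ships in the prom/prometheus image: docker exec prometheus promtool check rules /etc/prometheus/alerts.yml.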
Monitoring Multiple Servers
One of the greatest advantages of the Prometheus + Grafana stack is centralized monitoring of all servers from one place.
For each new server:
- Install Node Exporter:
  docker run -d \
    --name node-exporter \
    --restart unless-stopped \
    -p 9100:9100 \
    -v /proc:/host/proc:ro \
    -v /sys:/host/sys:ro \
    -v /:/rootfs:ro \
    prom/node-exporter \
    --path.procfs=/host/proc \
    --path.sysfs=/host/sys \
    --path.rootfs=/rootfs
- Add server to prometheus.yml:
  - job_name: 'node-new-server'
    static_configs:
      - targets: ['NEW_SERVER_IP:9100']
        labels:
          instance: 'server-name'
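- Reload Prometheus configuration (the HTTP endpoint only works when Prometheus runs with --web.enable-lifecycle; otherwise send SIGHUP to the container):
  curl -X POST http://localhost:9090/-/reload
  # alternative without the lifecycle flag:
  docker kill --signal=SIGHUP prometheus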
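After the reload, it is worth confirming that Prometheus actually scrapes the new server. The sketch below queries the up series through the Prometheus HTTP API. The function names and the PROM_URL variable are hypothetical, and the instance value is whatever label you set in prometheus.yml.

```shell
#!/bin/sh
# Hypothetical helpers; PROM_URL defaults to the Prometheus port published by the Compose file.

# Build the PromQL selector for one target's "up" series.
target_up_query() {
  printf 'up{instance="%s"}' "$1"
}

# Ask the Prometheus HTTP API for the current value of that series.
# A sample value of "1" in the JSON response means the last scrape succeeded.
check_target() {
  curl -fsS -G "${PROM_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$(target_up_query "$1")"
}
```

For example, check_target server-name returns a JSON document. An empty result list usually means the target is not yet part of the loaded configuration.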
Conclusion
Prometheus and Grafana are the de facto standard for monitoring modern IT infrastructure. Implementation is relatively straightforward with Docker Compose, and the value they bring is enormous — fewer unplanned outages, faster problem resolution, and better insight into system health.
For small and medium businesses, monitoring isn’t a luxury — it’s the foundation of professional IT management. The difference between a company that hears about a problem from a client and one that resolves it before the client notices is exactly this kind of system.
DevTet implements and maintains Grafana/Prometheus monitoring as part of our managed IT services. If you don’t want to deal with configuration and maintenance, contact us and we’ll set up a complete monitoring stack for your infrastructure.
DevTet provides managed IT services, monitoring, and cloud infrastructure for small and medium businesses. See our services.