Job Throughput
- Jobs per second
- Success/failure rates
- Processing times
Comprehensive monitoring and observability for your Mend deployment.
Mend exposes Prometheus metrics for monitoring performance, tracking usage, and alerting on issues.
GET /metrics
Public: no authentication required.

    curl http://localhost:8080/metrics

Job metrics

| Metric | Type | Description |
|---|---|---|
| mend_jobs_total | Counter | Total jobs processed, by type and status |
| mend_job_duration_seconds | Histogram | Job processing duration |
| mend_jobs_in_progress | Gauge | Jobs currently being processed |
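
To read these metrics programmatically rather than with `curl`, the endpoint's text output can be parsed line by line. The following is a minimal sketch using only the Python standard library. The helper names (`parse_metrics`, `fetch_metrics`), the label values `scan` and `completed`, and all numbers in `SAMPLE` are illustrative assumptions, not actual Mend output; the sketch also assumes samples carry no trailing timestamp.

```python
from urllib.request import urlopen


def parse_metrics(text):
    """Parse Prometheus text exposition into {name_with_labels: float}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The value follows the last space after the name and label set
        # (assumption: no trailing timestamp on the sample line).
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def fetch_metrics(url="http://localhost:8080/metrics"):
    """Scrape the endpoint once and return the parsed samples."""
    with urlopen(url, timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


# Hypothetical scrape output, for illustration only.
SAMPLE = """\
# HELP mend_jobs_total Total jobs processed by type and status
# TYPE mend_jobs_total counter
mend_jobs_total{type="scan",status="completed"} 1024
mend_jobs_total{type="scan",status="failed"} 7
# TYPE mend_jobs_in_progress gauge
mend_jobs_in_progress 3
"""

parsed = parse_metrics(SAMPLE)
print(parsed["mend_jobs_in_progress"])
```

For production use, the official Prometheus client libraries ship a full parser that also handles timestamps and escaping; the sketch above is only meant to show the shape of the data.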

Queue metrics

| Metric | Type | Description |
|---|---|---|
| mend_queue_depth | Gauge | Current queue depth, by type |
| mend_queue_jobs_enqueued_total | Counter | Total jobs enqueued |
| mend_queue_jobs_dequeued_total | Counter | Total jobs dequeued |
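
The two counters above can be combined to tell whether the queue is filling faster than it drains. The sketch below computes the net growth in jobs per second between two scrapes. The function names and the sample values are hypothetical, and the reset handling is a simplification of what Prometheus does for counters.

```python
def counter_increase(previous, current):
    """Increase of a counter between two scrapes, tolerating a reset."""
    # A counter that went down was reset (for example by a restart), so the
    # current value is treated as the whole increase since the reset.
    return current if current < previous else current - previous


def net_queue_growth(prev, curr, interval_seconds):
    """Jobs per second added to the queue minus jobs per second removed."""
    enqueued = counter_increase(prev["enqueued"], curr["enqueued"])
    dequeued = counter_increase(prev["dequeued"], curr["dequeued"])
    return (enqueued - dequeued) / interval_seconds


# Hypothetical readings of mend_queue_jobs_enqueued_total and
# mend_queue_jobs_dequeued_total taken 60 seconds apart.
before = {"enqueued": 1000, "dequeued": 990}
after = {"enqueued": 1300, "dequeued": 1230}

growth = net_queue_growth(before, after, 60)
print(growth)  # positive means the backlog is growing
```

A sustained positive value suggests that workers are not keeping up, which should eventually show up as a rising `mend_queue_depth`.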

System metrics

| Metric | Type | Description |
|---|---|---|
| mend_worker_utilization | Gauge | Worker utilization percentage |
| mend_memory_usage_bytes | Gauge | Memory usage in bytes |
| mend_disk_usage_bytes | Gauge | Disk usage in bytes |
Import the provided Grafana dashboard for visualization:
- Job Throughput
- Queue Health
- System Resources
Example Prometheus alerting rules:

    groups:
      - name: mend
        rules:
          - alert: HighQueueDepth
            expr: mend_queue_depth > 100
            for: 5m
            annotations:
              summary: "Queue depth is high"

          - alert: HighFailureRate
            expr: rate(mend_jobs_total{status="failed"}[5m]) > 0.1
            for: 5m
            annotations:
              summary: "Job failure rate is high"
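
Both rules use `for: 5m`, which means the expression must stay true across evaluations for five minutes before the alert fires; a brief spike only puts it in a pending state. The sketch below approximates that behavior for the `HighQueueDepth` threshold. The function name, the 60-second evaluation interval, and the sample values are assumptions made for illustration.

```python
def firing_times(samples, threshold, for_seconds):
    """Return the evaluation timestamps at which the alert would be firing.

    samples: ordered list of (timestamp_seconds, value), one per evaluation.
    """
    pending_since = None  # when the condition most recently became true
    firing = []
    for ts, value in samples:
        if value > threshold:
            if pending_since is None:
                pending_since = ts
            # Fire only once the condition has held for the full duration.
            if ts - pending_since >= for_seconds:
                firing.append(ts)
        else:
            # Any evaluation below the threshold resets the pending timer.
            pending_since = None
    return firing


# Hypothetical mend_queue_depth readings, evaluated every 60 seconds.
values = [50, 120, 130, 90, 150, 150, 150, 150, 150, 150, 150]
samples = [(i * 60, v) for i, v in enumerate(values)]

fired = firing_times(samples, threshold=100, for_seconds=300)
print(fired)
```

In this run the short spike at 60 to 120 seconds never fires because the depth drops back below 100 at 180 seconds. For `HighFailureRate`, note that `rate()` returns a per-second average, so the `0.1` threshold corresponds to more than 30 failed jobs over the 5-minute window.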