# V2 Docker Monitoring Stack

## Overview

The monitoring solution for V2 Docker is built on the Prometheus stack and provides detailed insight into the performance and health of all services.

## Components

### 1. **Prometheus** (Port 9090)
- Central metrics collection
- Scrape jobs configured for all services
- 30-day data retention
- Alert rules for critical events

### 2. **Grafana** (Port 3000)
- Metrics visualization
- Preconfigured dashboards
- Alerting integration
- Default login: admin/admin (change it on first login)

### 3. **Alertmanager** (Port 9093)
- Alert routing and grouping
- Email notifications
- Webhook integration
- Alert silencing and inhibition

### 4. **Exporters**
- **PostgreSQL Exporter**: database metrics
- **Redis Exporter**: cache metrics
- **Node Exporter**: system metrics
- **Nginx Exporter**: proxy metrics

## Installation

### 1. Start the monitoring stack

```bash
cd monitoring
docker-compose -f docker-compose.monitoring.yml up -d
```

### 2. Verify the services

```bash
docker-compose -f docker-compose.monitoring.yml ps
```
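
Beyond the container status, each core service exposes an HTTP health endpoint. The following is a minimal sketch that probes them, assuming the services are published on `localhost` with the ports listed under Components. The function names `health_urls` and `is_healthy` are illustrative and not part of the stack.

```python
import http.client
import urllib.request

# Health endpoints of the three core services on the documented ports.
HEALTH_PATHS = {
    "prometheus": (9090, "/-/healthy"),
    "grafana": (3000, "/api/health"),
    "alertmanager": (9093, "/-/healthy"),
}


def health_urls(host="localhost"):
    """Build the health-check URL for each service on the given host."""
    return {
        name: f"http://{host}:{port}{path}"
        for name, (port, path) in HEALTH_PATHS.items()
    }


def is_healthy(url, timeout=3.0):
    """Return True if the URL answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


if __name__ == "__main__":
    for name, url in health_urls().items():
        state = "OK" if is_healthy(url) else "UNREACHABLE"
        print(f"{name:<13} {state:<12} {url}")
```

If a service is reported as unreachable although its container is running, check whether its port is actually published to the host in `docker-compose.monitoring.yml`.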

### 3. Access Grafana

1. Open https://monitoring.v2-docker.com (or http://localhost:3000)
2. Log in with admin/admin
3. Set a new password
4. Open the "License Server Overview" dashboard

## Configuration

### Environment variables

Create a `.env` file in the monitoring directory:

```env
# Grafana
GRAFANA_USER=admin
GRAFANA_PASSWORD=secure-password

# PostgreSQL Connection
POSTGRES_PASSWORD=your-postgres-password

# Alertmanager SMTP
SMTP_USERNAME=alerts@yourdomain.com
SMTP_PASSWORD=smtp-password

# Webhook URLs
WEBHOOK_CRITICAL=https://your-webhook-url/critical
WEBHOOK_SECURITY=https://your-webhook-url/security
```
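
A missing or empty variable usually only shows up later as a failed login or an undelivered alert. As a rough sketch, the check below parses a `.env` file and reports which of the keys listed above are absent or empty. The helper names `parse_env` and `missing_keys` are hypothetical, and the parser deliberately handles only plain `KEY=VALUE` lines (no quoting or variable expansion).

```python
# Keys taken from the example .env file in this README.
REQUIRED_KEYS = [
    "GRAFANA_USER", "GRAFANA_PASSWORD", "POSTGRES_PASSWORD",
    "SMTP_USERNAME", "SMTP_PASSWORD", "WEBHOOK_CRITICAL", "WEBHOOK_SECURITY",
]


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty, in the given order."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    try:
        with open(".env", encoding="utf-8") as handle:
            print("Missing keys:", missing_keys(handle.read()) or "none")
    except FileNotFoundError:
        print("No .env file found in the current directory")
```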

### Alert configuration

Alerts are defined in `prometheus/rules/license-server-alerts.yml`:

- **HighLicenseValidationErrorRate**: error rate > 5%
- **PossibleLicenseAbuse**: suspicious activity
- **LicenseServerDown**: service unreachable
- **HighLicenseValidationLatency**: response time > 500 ms
- **DatabaseConnectionPoolExhausted**: DB connections > 90%

### Adding new alerts

1. Edit `prometheus/rules/license-server-alerts.yml`
2. Add a new alert rule:

```yaml
- alert: YourAlertName
  expr: your_prometheus_query > threshold
  for: 5m
  labels:
    severity: warning
    service: your-service
  annotations:
    summary: "Alert summary"
    description: "Detailed description"
```

3. Reload Prometheus (the endpoint is only available when Prometheus runs with `--web.enable-lifecycle`):

```bash
curl -X POST http://localhost:9090/-/reload
```
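
After a reload it is worth confirming that the new rule was actually picked up. The sketch below reads the rule list from the Prometheus HTTP API (`/api/v1/rules`) and extracts the alert names, assuming Prometheus is reachable on `localhost:9090`. The function names are illustrative.

```python
import json
import urllib.request


def alert_names(rules_response):
    """Collect the names of all alerting rules from a /api/v1/rules response."""
    names = []
    for group in rules_response.get("data", {}).get("groups", []):
        for rule in group.get("rules", []):
            # Recording rules have type "recording" and are skipped here.
            if rule.get("type") == "alerting":
                names.append(rule["name"])
    return names


def fetch_rules(base_url="http://localhost:9090"):
    """Fetch the currently loaded rule groups from Prometheus."""
    with urllib.request.urlopen(f"{base_url}/api/v1/rules", timeout=5) as response:
        return json.load(response)
```

With the stack running, `"YourAlertName" in alert_names(fetch_rules())` should evaluate to `True` once the rule has been loaded.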

## Dashboards

### License Server Overview

Shows the key metrics:
- Active licenses
- Validations per second
- Error rate
- Response time percentiles
- Anomaly detection
- Top 10 most active licenses

### Creating new dashboards

1. Log in to Grafana
2. Create → Dashboard
3. Add a panel
4. Enter a Prometheus query
5. Save the dashboard
6. Export it as JSON for backup

## Metrics

### License server metrics

- `license_validation_total`: number of validations
- `license_validation_duration_seconds`: validation duration
- `active_licenses_total`: active licenses
- `anomaly_detections_total`: detected anomalies
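
These metrics can also be read outside Grafana through the Prometheus HTTP API (`/api/v1/query`). The sketch below builds instant-query URLs for a few example expressions. It assumes Prometheus on `localhost:9090`, and the percentile query additionally assumes that `license_validation_duration_seconds` is exported as a histogram with a `_bucket` series, which this README does not state.

```python
import json
import urllib.parse
import urllib.request

# Example expressions; the histogram form of the duration metric is an assumption.
EXAMPLE_QUERIES = {
    "validations_per_second": "sum(rate(license_validation_total[5m]))",
    "active_licenses": "active_licenses_total",
    "p95_latency_seconds": (
        "histogram_quantile(0.95, "
        "sum by (le) (rate(license_validation_duration_seconds_bucket[5m])))"
    ),
}


def query_url(expr, base_url="http://localhost:9090"):
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def instant_query(expr, base_url="http://localhost:9090"):
    """Run an instant query and return (labels, value) pairs.

    Only handles expressions that evaluate to an instant vector.
    """
    with urllib.request.urlopen(query_url(expr, base_url), timeout=5) as response:
        payload = json.load(response)
    return [
        (sample["metric"], float(sample["value"][1]))
        for sample in payload["data"]["result"]
    ]
```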

### System metrics

- `node_cpu_seconds_total`: CPU time spent per mode (the basis for CPU utilization)
- `node_memory_MemAvailable_bytes`: available memory
- `node_filesystem_avail_bytes`: available disk space
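
`node_cpu_seconds_total` is a counter of CPU seconds per mode, so utilization has to be derived from how fast the `idle` counter grows. In PromQL this is commonly written as `100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])))`. The arithmetic behind that expression can be sketched as follows; the function name and the sample numbers are illustrative only.

```python
def cpu_utilization_percent(idle_start, idle_end, interval_seconds, cores):
    """Derive CPU utilization from two readings of the summed idle counter.

    idle_start / idle_end: idle CPU seconds summed over all cores.
    interval_seconds: wall-clock time between the two readings.
    cores: number of CPU cores contributing to the counter.
    """
    # Each core can accumulate at most interval_seconds of idle time.
    idle_fraction = (idle_end - idle_start) / (interval_seconds * cores)
    return 100.0 * (1.0 - idle_fraction)


if __name__ == "__main__":
    # Two cores, 5-minute window, idle counter grew by 480 s out of a possible 600 s.
    print(round(cpu_utilization_percent(1000.0, 1480.0, 300.0, 2), 1))
```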

### Database metrics

- `pg_stat_database_numbackends`: active DB connections
- `pg_stat_database_tup_fetched`: tuples fetched
- `pg_stat_database_conflicts`: conflicts

## Troubleshooting

### Prometheus cannot reach a service

1. Check the network:
```bash
docker network inspect v2_internal_net
```

2. Test service discovery:
```bash
docker exec prometheus wget -O- http://license-server:8443/metrics
```

### No data in Grafana

1. Check the data source:
   - Settings → Data Sources → Prometheus
   - Test Connection

2. Check the Prometheus targets:
   - http://localhost:9090/targets
   - All targets should be "UP"
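
The same target information is available as JSON from `/api/v1/targets`, which is handy for scripted checks. The sketch below lists every active target that is not healthy together with its last scrape error, assuming Prometheus on `localhost:9090`. The function names are illustrative.

```python
import json
import urllib.request


def down_targets(targets_response):
    """Return (job, scrape URL, last error) for every target that is not up."""
    problems = []
    for target in targets_response.get("data", {}).get("activeTargets", []):
        if target.get("health") != "up":
            job = target.get("labels", {}).get("job", "unknown")
            problems.append(
                (job, target.get("scrapeUrl", ""), target.get("lastError", ""))
            )
    return problems


def fetch_targets(base_url="http://localhost:9090"):
    """Fetch the current scrape targets from Prometheus."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=5) as response:
        return json.load(response)
```

An empty result from `down_targets(fetch_targets())` means Prometheus is scraping everything successfully, so the problem is more likely in the Grafana data source or the dashboard queries.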

### Alerts are not being sent

1. Check the Alertmanager logs:
```bash
docker logs alertmanager
```

2. Verify the SMTP configuration
3. Test the webhook URLs
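
To exercise the whole notification path without waiting for a real incident, a synthetic alert can be posted directly to the Alertmanager v2 API (`/api/v2/alerts`). The following is a sketch under the assumption that Alertmanager is reachable on `localhost:9093`. The alert name `MonitoringTestAlert` and the `service` label value are made up for this example, and which receiver fires depends on your routing configuration.

```python
import json
import urllib.request


def build_test_alert(name="MonitoringTestAlert", severity="warning"):
    """Build a one-element alert list in the format expected by /api/v2/alerts."""
    return [{
        "labels": {
            "alertname": name,
            "severity": severity,
            "service": "monitoring-test",  # hypothetical label value
        },
        "annotations": {
            "summary": "Manual test alert",
            "description": "Sent by hand to verify alert delivery.",
        },
    }]


def send_alert(alerts, base_url="http://localhost:9093"):
    """POST the alerts to Alertmanager and return the HTTP status code."""
    request = urllib.request.Request(
        f"{base_url}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

Calling `send_alert(build_test_alert(severity="critical"))` should make the alert appear in the Alertmanager UI. If it shows up there but no email or webhook call follows, the problem lies in the receiver configuration rather than in Prometheus.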

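## Maintenance

### Backup

1. Prometheus data (for a consistent copy, stop Prometheus first or use the TSDB snapshot API, which requires `--web.enable-admin-api`):
```bash
# Stream the archive to the host so it is not written into the directory being archived
mkdir -p backups
docker exec prometheus tar czf - -C / prometheus > ./backups/backup.tar.gz
```

2. Grafana dashboards:
   - Export as JSON via the UI
   - Save to `grafana/dashboards/`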
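
Exporting dashboards by hand gets tedious with more than a few of them. As an alternative, the sketch below pulls every dashboard through the Grafana HTTP API (`/api/search` and `/api/dashboards/uid/<uid>`) and writes it to `grafana/dashboards/`. It assumes an API or service account token with read access, which this README does not set up, and the function names are illustrative.

```python
import json
import os
import re
import urllib.request


def backup_filename(title, uid):
    """Turn a dashboard title and UID into a stable file name."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{slug}-{uid}.json"


def portable_dashboard(payload):
    """Extract the dashboard model and clear its numeric id for re-import."""
    dashboard = dict(payload["dashboard"])
    dashboard["id"] = None  # the numeric id is instance-specific
    return dashboard


def fetch_json(path, base_url, token):
    """GET a Grafana API path using bearer-token authentication."""
    request = urllib.request.Request(
        f"{base_url}{path}", headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def export_all(base_url, token, target_dir="grafana/dashboards"):
    """Write every dashboard found by the search API to target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    for hit in fetch_json("/api/search?type=dash-db", base_url, token):
        payload = fetch_json(f"/api/dashboards/uid/{hit['uid']}", base_url, token)
        path = os.path.join(target_dir, backup_filename(hit["title"], hit["uid"]))
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(portable_dashboard(payload), handle, indent=2)
```

A call such as `export_all("http://localhost:3000", token)` would then produce one JSON file per dashboard, where `token` is a placeholder for your own credential.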

### Updates

1. Update the images:
```bash
docker-compose -f docker-compose.monitoring.yml pull
docker-compose -f docker-compose.monitoring.yml up -d
```

2. Reload the configuration:
```bash
# Prometheus
curl -X POST http://localhost:9090/-/reload

# Alertmanager
curl -X POST http://localhost:9093/-/reload
```

## Performance tuning

### Adjust retention

In `docker-compose.monitoring.yml`:
```yaml
command:
  - '--storage.tsdb.retention.time=15d'  # Lower this to use less disk space
```

### Scrape intervals

In `prometheus/prometheus.yml`:
```yaml
global:
  scrape_interval: 30s  # Raise this to reduce load
```
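
Retention and scrape interval together determine how much disk Prometheus needs. The Prometheus documentation gives the rule of thumb `needed_disk_space = retention_time_seconds * ingested_samples_per_second * bytes_per_sample`, with roughly 1 to 2 bytes per sample. The sketch below applies it; the series count of 10,000 is a made-up figure, so substitute the value from your own instance.

```python
SECONDS_PER_DAY = 86400


def estimated_disk_bytes(active_series, scrape_interval_seconds,
                         retention_days, bytes_per_sample=2.0):
    """Estimate TSDB disk usage from series count, interval and retention."""
    # Every active series yields one sample per scrape interval.
    samples_per_second = active_series / scrape_interval_seconds
    retention_seconds = retention_days * SECONDS_PER_DAY
    return retention_seconds * samples_per_second * bytes_per_sample


if __name__ == "__main__":
    # Hypothetical instance with 10,000 active series.
    for interval, days in [(15, 30), (30, 30), (30, 15)]:
        gib = estimated_disk_bytes(10_000, interval, days) / 1024 ** 3
        print(f"interval={interval}s retention={days}d -> about {gib:.2f} GiB")
```

Doubling the scrape interval or halving the retention each cut the estimate in half, so the two settings can be traded against each other.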

### Resource limits

Adjust the limits in `docker-compose.monitoring.yml` to match your environment.

## Security

1. **Grafana**: Change the default password immediately
2. **Prometheus**: No public access (internal only)
3. **Alertmanager**: Keep webhook URLs secret
4. **Exporters**: Reachable only on the internal network

## Integration

### In a CI/CD pipeline

```bash
# Send deployment metrics (the Pushgateway expects a newline after the last sample)
echo 'deployment_status{version="1.2.3",environment="production"} 1' | \
  curl --data-binary @- http://prometheus-pushgateway:9091/metrics/job/deployment
```
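
If the pipeline step is written in Python rather than shell, the same push can be done with the standard library. This is a sketch that assumes the Pushgateway address used above. The helper names are illustrative, and label values are not escaped, so they must not contain quotes, backslashes, or newlines.

```python
import urllib.request


def format_sample(name, labels, value):
    """Render one sample in the Prometheus text format, newline-terminated."""
    label_text = ",".join(f'{key}="{val}"' for key, val in labels.items())
    return f"{name}{{{label_text}}} {value}\n"


def push(body, job, base_url="http://prometheus-pushgateway:9091"):
    """POST a text-format body to the Pushgateway and return the HTTP status."""
    request = urllib.request.Request(
        f"{base_url}/metrics/job/{job}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/plain; version=0.0.4"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

A deployment step could then call `push(format_sample("deployment_status", {"version": "1.2.3", "environment": "production"}, 1), job="deployment")`.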

### Custom metrics

In your application:
```python
from prometheus_client import Counter

# Counter names conventionally end in _total
custom_metric = Counter('my_custom_total', 'Description')
custom_metric.inc()
```

## Support

If you run into problems:
1. Check the logs: `docker-compose -f docker-compose.monitoring.yml logs [service]`
2. Prometheus documentation: https://prometheus.io/docs/
3. Grafana documentation: https://grafana.com/docs/