Pipeline Deployment Guide
Complete deployment guide for Telegen unified pipeline across all environments.
Overview
The unified pipeline provides a unified data path for metrics, traces, and logs with:
- Data Quality Controls: Cardinality limiting, rate limiting, attribute limits
- Signal Transformation: Rule-based transformations with PII redaction
- Flexible Export: Multi-endpoint failover, persistent queuing
- Hot Reload: Configuration changes without restart
- Graceful Shutdown: Drain in-flight data on shutdown
Configuration Reference
All pipeline features are configured under the pipeline section:
pipeline:
enabled: true
limits:
cardinality:
enabled: true
default_max_series: 10000
global_max_series: 100000
rate:
enabled: true
metrics_per_second: 100000
traces_per_second: 50000
logs_per_second: 200000
attributes:
enabled: true
max_resource_attributes: 128
max_attribute_value_size: 4096
transform:
enabled: true
rules:
- name: add-environment
actions:
- type: set_attribute
set_attribute:
key: environment
value: production
pii_redaction:
enabled: true
scan_log_bodies: true
export:
otlp:
endpoint: otel-collector:4317
batch:
size: 1000
timeout: 5s
operations:
hot_reload:
enabled: true
enable_sighup: true
shutdown:
timeout: 30s
drain_timeout: 10s
Bare Metal / Virtual Machines
systemd Deployment
This is the recommended method for Linux servers, VMs, and bare-metal hosts.
Prerequisites
- Linux kernel 4.18+ (5.8+ recommended for full eBPF support)
- systemd
- Root access or CAP_BPF/CAP_SYS_ADMIN capabilities
- Network access to OTLP endpoint
Step 1: Download Binary
# Latest version
VERSION=$(curl -s https://api.github.com/repos/mirastacklabs-ai/telegen/releases/latest \
| grep tag_name | cut -d '"' -f4 | sed 's/release\/mark-v//')
# Download (amd64)
curl -LO "https://github.com/mirastacklabs-ai/telegen/releases/download/release/mark-v${VERSION}/telegen-linux-amd64.tar.gz"
tar xzf telegen-linux-amd64.tar.gz
sudo mv telegen-linux-amd64 /usr/local/bin/telegen
sudo chmod +x /usr/local/bin/telegen
# Verify
telegen --version
For ARM64:
curl -LO "https://github.com/mirastacklabs-ai/telegen/releases/download/release/mark-v${VERSION}/telegen-linux-arm64.tar.gz"
Step 2: Create V3 Configuration
sudo mkdir -p /etc/telegen
cat << 'EOF' | sudo tee /etc/telegen/config.yaml
telegen:
mode: agent
service_name: telegen
log_level: info
# Pipeline Configuration
pipeline:
enabled: true
limits:
cardinality:
enabled: true
default_max_series: 10000
global_max_series: 100000
series_ttl: 1h
rate:
enabled: true
metrics_per_second: 100000
traces_per_second: 50000
logs_per_second: 200000
attributes:
enabled: true
max_resource_attributes: 128
max_attribute_value_size: 4096
protected_attributes:
- service.name
- host.name
pii_redaction:
enabled: true
scan_log_bodies: true
rules:
- name: email
type: email
enabled: true
- name: ssn
type: ssn
enabled: true
transform:
enabled: true
rules:
- name: add-host-info
match:
signal_types: [metrics, traces, logs]
actions:
- type: set_attribute
set_attribute:
key: deployment.environment
value: ${TELEGEN_ENVIRONMENT:-production}
export:
otlp:
endpoint: ${TELEGEN_OTLP_ENDPOINT:-otel-collector:4317}
insecure: ${TELEGEN_OTLP_INSECURE:-true}
batch:
size: 1000
timeout: 5s
queue:
enabled: true
directory: /var/lib/telegen/queue
max_size_bytes: 500000000
operations:
hot_reload:
enabled: true
config_path: /etc/telegen/config.yaml
check_interval: 30s
enable_sighup: true
shutdown:
timeout: 30s
drain_timeout: 10s
# Agent configuration
agent:
ebpf:
enabled: true
network:
enabled: true
http: true
grpc: true
syscalls:
enabled: true
profiling:
enabled: true
cpu: true
memory: true
discovery:
enabled: true
interval: 30s
self_telemetry:
enabled: true
listen: ":19090"
EOF
Step 3: Environment File
cat << 'EOF' | sudo tee /etc/telegen/telegen.env
# OTLP endpoint
TELEGEN_OTLP_ENDPOINT=otel-collector.example.com:4317
TELEGEN_OTLP_INSECURE=false
# Environment tag
TELEGEN_ENVIRONMENT=production
# Optional: API authentication
# OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer your-token
# Logging
TELEGEN_LOG_LEVEL=info
EOF
sudo chmod 600 /etc/telegen/telegen.env
Step 4: systemd Service
cat << 'EOF' | sudo tee /etc/systemd/system/telegen.service
[Unit]
Description=Telegen V3 Observability Agent
Documentation=https://telegen.mirastacklabs.ai
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=root
Group=root
EnvironmentFile=/etc/telegen/telegen.env
ExecStart=/usr/local/bin/telegen --config=/etc/telegen/config.yaml
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
RestartSec=5
LimitNOFILE=65536
LimitMEMLOCK=infinity
# eBPF capabilities
AmbientCapabilities=CAP_SYS_ADMIN CAP_SYS_PTRACE CAP_NET_ADMIN CAP_BPF CAP_PERFMON CAP_SYS_RESOURCE CAP_DAC_READ_SEARCH
NoNewPrivileges=false
StandardOutput=journal
StandardError=journal
SyslogIdentifier=telegen
[Install]
WantedBy=multi-user.target
EOF
Step 5: Start Service
sudo systemctl daemon-reload
sudo systemctl enable telegen
sudo systemctl start telegen
# Check status
sudo systemctl status telegen
# View logs
sudo journalctl -u telegen -f
Hot Reload Configuration
# Edit configuration
sudo vim /etc/telegen/config.yaml
# Reload without restart
sudo systemctl reload telegen
# Or send SIGHUP directly
sudo kill -HUP $(pidof telegen)
Docker Compose
Single Node Agent
# docker-compose.yaml
version: '3.8'
services:
telegen:
image: ghcr.io/mirastacklabs-ai/telegen:latest
container_name: telegen
restart: unless-stopped
privileged: true
pid: host
network_mode: host
environment:
- TELEGEN_OTLP_ENDPOINT=otel-collector:4317
- TELEGEN_ENVIRONMENT=production
- TELEGEN_LOG_LEVEL=info
volumes:
- /sys:/sys:ro
- /proc:/host/proc:ro
- /sys/kernel/debug:/sys/kernel/debug
- /sys/fs/bpf:/sys/fs/bpf
- ./configs/telegen.yaml:/etc/telegen/config.yaml:ro
- telegen-queue:/var/lib/telegen/queue
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:19090/healthz"]
interval: 30s
timeout: 10s
retries: 3
volumes:
telegen-queue:
Full Observability Stack
# docker-compose.full.yaml
version: '3.8'
services:
# Telegen Agent
telegen:
image: ghcr.io/mirastacklabs-ai/telegen:latest
container_name: telegen
restart: unless-stopped
privileged: true
pid: host
network_mode: host
environment:
- TELEGEN_OTLP_ENDPOINT=localhost:4317
volumes:
- /sys:/sys:ro
- /proc:/host/proc:ro
- /sys/kernel/debug:/sys/kernel/debug
- /sys/fs/bpf:/sys/fs/bpf
- ./configs/agent.yaml:/etc/telegen/config.yaml:ro
# OpenTelemetry Collector
otel-collector:
image: otel/opentelemetry-collector-contrib:latest
container_name: otel-collector
restart: unless-stopped
command: ["--config=/etc/otel/config.yaml"]
volumes:
- ./configs/otel-collector.yaml:/etc/otel/config.yaml:ro
ports:
- "4317:4317" # OTLP gRPC
- "4318:4318" # OTLP HTTP
- "8888:8888" # Metrics
# Prometheus
prometheus:
image: prom/prometheus:latest
container_name: prometheus
restart: unless-stopped
command:
- '--config.file=/etc/prometheus/prometheus.yml'
- '--storage.tsdb.path=/prometheus'
- '--web.enable-remote-write-receiver'
volumes:
- ./configs/prometheus.yaml:/etc/prometheus/prometheus.yml:ro
- prometheus-data:/prometheus
ports:
- "9090:9090"
# Loki
loki:
image: grafana/loki:latest
container_name: loki
restart: unless-stopped
command: -config.file=/etc/loki/config.yaml
volumes:
- ./configs/loki.yaml:/etc/loki/config.yaml:ro
- loki-data:/loki
ports:
- "3100:3100"
# Grafana
grafana:
image: grafana/grafana:latest
container_name: grafana
restart: unless-stopped
environment:
- GF_SECURITY_ADMIN_PASSWORD=admin
volumes:
- grafana-data:/var/lib/grafana
ports:
- "3000:3000"
volumes:
prometheus-data:
loki-data:
grafana-data:
Collector Mode (Non-eBPF)
# docker-compose.collector.yaml
version: '3.8'
services:
telegen-collector:
image: ghcr.io/mirastacklabs-ai/telegen:latest
container_name: telegen-collector
restart: unless-stopped
environment:
- TELEGEN_OTLP_ENDPOINT=otel-collector:4317
volumes:
- ./configs/collector.yaml:/etc/telegen/config.yaml:ro
ports:
- "19090:19090" # Health/metrics
# No privileged mode needed for collector mode
With collector config:
# configs/collector.yaml
telegen:
mode: collector
service_name: telegen-collector
pipeline:
enabled: true
limits:
cardinality:
enabled: true
default_max_series: 50000
pii_redaction:
enabled: true
export:
otlp:
endpoint: ${TELEGEN_OTLP_ENDPOINT}
collectors:
prometheus:
enabled: true
scrape_interval: 30s
targets:
- name: node-exporter
address: node-exporter:9100
- name: cadvisor
address: cadvisor:8080
metrics_path: /metrics
Kubernetes
DaemonSet (Agent Mode)
# telegen-daemonset.yaml
apiVersion: v1
kind: Namespace
metadata:
name: telegen
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: telegen
namespace: telegen
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: telegen
rules:
- apiGroups: [""]
resources: ["nodes", "pods", "services", "endpoints", "namespaces", "configmaps"]
verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
resources: ["deployments", "replicasets", "daemonsets", "statefulsets"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["nodes/proxy", "nodes/stats"]
verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: telegen
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: telegen
subjects:
- kind: ServiceAccount
name: telegen
namespace: telegen
---
apiVersion: v1
kind: ConfigMap
metadata:
name: telegen-config
namespace: telegen
data:
config.yaml: |
telegen:
mode: agent
service_name: telegen
log_level: info
pipeline:
enabled: true
limits:
cardinality:
enabled: true
default_max_series: 10000
global_max_series: 100000
rate:
enabled: true
metrics_per_second: 100000
attributes:
enabled: true
protected_attributes:
- service.name
- k8s.namespace.name
- k8s.pod.name
pii_redaction:
enabled: true
scan_log_bodies: true
transform:
enabled: true
rules:
- name: add-cluster
match:
signal_types: [metrics, traces, logs]
actions:
- type: set_attribute
set_attribute:
key: k8s.cluster.name
value: ${CLUSTER_NAME}
export:
otlp:
endpoint: ${OTLP_ENDPOINT}
insecure: true
batch:
size: 1000
timeout: 5s
queue:
enabled: true
directory: /var/lib/telegen/queue
max_size_bytes: 100000000
operations:
hot_reload:
enabled: true
enable_sighup: true
shutdown:
timeout: 30s
drain_timeout: 10s
agent:
ebpf:
enabled: true
network:
enabled: true
profiling:
enabled: true
discovery:
enabled: true
kube_metrics:
enabled: true
self_telemetry:
enabled: true
listen: ":19090"
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: telegen
namespace: telegen
labels:
app: telegen
spec:
selector:
matchLabels:
app: telegen
template:
metadata:
labels:
app: telegen
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "19090"
spec:
serviceAccountName: telegen
hostPID: true
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
tolerations:
- operator: Exists
effect: NoSchedule
- operator: Exists
effect: NoExecute
containers:
- name: telegen
image: ghcr.io/mirastacklabs-ai/telegen:latest
imagePullPolicy: IfNotPresent
args:
- --config=/etc/telegen/config.yaml
env:
- name: OTLP_ENDPOINT
value: "otel-collector.observability:4317"
- name: CLUSTER_NAME
value: "production"
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
resources:
requests:
cpu: 200m
memory: 256Mi
limits:
cpu: 1000m
memory: 1Gi
securityContext:
privileged: true
volumeMounts:
- name: config
mountPath: /etc/telegen
readOnly: true
- name: sys
mountPath: /sys
readOnly: true
- name: proc
mountPath: /host/proc
readOnly: true
- name: debugfs
mountPath: /sys/kernel/debug
- name: bpffs
mountPath: /sys/fs/bpf
- name: queue
mountPath: /var/lib/telegen/queue
livenessProbe:
httpGet:
path: /healthz
port: 19090
initialDelaySeconds: 30
periodSeconds: 30
readinessProbe:
httpGet:
path: /readyz
port: 19090
initialDelaySeconds: 10
periodSeconds: 10
volumes:
- name: config
configMap:
name: telegen-config
- name: sys
hostPath:
path: /sys
- name: proc
hostPath:
path: /proc
- name: debugfs
hostPath:
path: /sys/kernel/debug
- name: bpffs
hostPath:
path: /sys/fs/bpf
- name: queue
emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
name: telegen
namespace: telegen
labels:
app: telegen
spec:
type: ClusterIP
ports:
- name: metrics
port: 19090
targetPort: 19090
selector:
app: telegen
Apply to Cluster
# Create resources
kubectl apply -f telegen-daemonset.yaml
# Verify
kubectl -n telegen get pods
kubectl -n telegen logs -l app=telegen -f
# Check health
kubectl -n telegen exec ds/telegen -- curl -s localhost:19090/healthz
Helm
Quick Start
# Add repository
helm repo add telegen https://charts.mirastacklabs.ai
helm repo update
# Install with defaults
helm install telegen telegen/telegen -n telegen --create-namespace
# Install with custom values
helm install telegen telegen/telegen -n telegen --create-namespace \
--set otlp.endpoint=otel-collector:4317 \
--set pipeline.enabled=true \
--set pipeline.limits.cardinality.enabled=true
Custom Values
# values.yaml
replicaCount: 1 # For DaemonSet, this is ignored
image:
repository: ghcr.io/mirastacklabs-ai/telegen
tag: latest
pullPolicy: IfNotPresent
# OTLP configuration
otlp:
endpoint: otel-collector.observability:4317
insecure: true
headers: {}
# Pipeline configuration
v3Pipeline:
enabled: true
limits:
cardinality:
enabled: true
defaultMaxSeries: 10000
globalMaxSeries: 100000
rate:
enabled: true
metricsPerSecond: 100000
tracesPerSecond: 50000
logsPerSecond: 200000
attributes:
enabled: true
maxResourceAttributes: 128
protectedAttributes:
- service.name
- k8s.namespace.name
piiRedaction:
enabled: true
scanLogBodies: true
transform:
enabled: true
rules:
- name: add-cluster
actions:
- type: set_attribute
setAttribute:
key: k8s.cluster.name
value: "production"
export:
batch:
size: 1000
timeout: 5s
queue:
enabled: true
maxSizeBytes: 100000000
operations:
hotReload:
enabled: true
shutdown:
timeout: 30s
# Agent configuration
agent:
ebpf:
enabled: true
network: true
profiling:
enabled: true
discovery:
enabled: true
# Resources
resources:
requests:
cpu: 200m
memory: 256Mi
limits:
cpu: 1000m
memory: 1Gi
# Pod tolerations for scheduling on all nodes
tolerations:
- operator: Exists
effect: NoSchedule
- operator: Exists
effect: NoExecute
# Service monitor for Prometheus Operator
serviceMonitor:
enabled: true
interval: 30s
Install with Values File
helm install telegen telegen/telegen -n telegen --create-namespace -f values.yaml
OpenShift
OpenShift requires additional security context constraints (SCC).
Create SCC
# telegen-scc.yaml
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
name: telegen-scc
allowHostDirVolumePlugin: true
allowHostIPC: true
allowHostNetwork: true
allowHostPID: true
allowHostPorts: true
allowPrivilegedContainer: true
allowedCapabilities:
- SYS_ADMIN
- SYS_PTRACE
- NET_ADMIN
- BPF
- PERFMON
- SYS_RESOURCE
- DAC_READ_SEARCH
fsGroup:
type: RunAsAny
readOnlyRootFilesystem: false
runAsUser:
type: RunAsAny
seLinuxContext:
type: RunAsAny
supplementalGroups:
type: RunAsAny
users:
- system:serviceaccount:telegen:telegen
volumes:
- '*'
Apply and Deploy
# Create SCC
oc apply -f telegen-scc.yaml
# Create project
oc new-project telegen
# Deploy (use the Kubernetes DaemonSet YAML from above)
oc apply -f telegen-daemonset.yaml
# Verify
oc get pods -n telegen
AWS ECS
Task Definition
{
"family": "telegen",
"networkMode": "host",
"pidMode": "host",
"requiresCompatibilities": ["EC2"],
"executionRoleArn": "arn:aws:iam::ACCOUNT:role/ecsTaskExecutionRole",
"taskRoleArn": "arn:aws:iam::ACCOUNT:role/telegenTaskRole",
"containerDefinitions": [
{
"name": "telegen",
"image": "ghcr.io/mirastacklabs-ai/telegen:latest",
"essential": true,
"privileged": true,
"environment": [
{"name": "TELEGEN_OTLP_ENDPOINT", "value": "otel-collector.internal:4317"},
{"name": "TELEGEN_ENVIRONMENT", "value": "production"}
],
"mountPoints": [
{"sourceVolume": "sys", "containerPath": "/sys", "readOnly": true},
{"sourceVolume": "proc", "containerPath": "/host/proc", "readOnly": true},
{"sourceVolume": "debugfs", "containerPath": "/sys/kernel/debug"},
{"sourceVolume": "bpffs", "containerPath": "/sys/fs/bpf"}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/telegen",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "telegen"
}
},
"healthCheck": {
"command": ["CMD-SHELL", "curl -f http://localhost:19090/healthz || exit 1"],
"interval": 30,
"timeout": 5,
"retries": 3
},
"cpu": 256,
"memory": 512
}
],
"volumes": [
{"name": "sys", "host": {"sourcePath": "/sys"}},
{"name": "proc", "host": {"sourcePath": "/proc"}},
{"name": "debugfs", "host": {"sourcePath": "/sys/kernel/debug"}},
{"name": "bpffs", "host": {"sourcePath": "/sys/fs/bpf"}}
]
}
Service (Daemon Scheduling)
{
"serviceName": "telegen",
"cluster": "production",
"taskDefinition": "telegen",
"schedulingStrategy": "DAEMON",
"deploymentConfiguration": {
"maximumPercent": 100,
"minimumHealthyPercent": 0
}
}
Verification
After deployment, verify Telegen is working:
# Check health
curl http://localhost:19090/healthz
# Expected: {"status":"healthy"}
# Check readiness
curl http://localhost:19090/readyz
# Expected: {"status":"ready"}
# Check metrics
curl http://localhost:19090/metrics | head -20
# Check pipeline stats
curl http://localhost:19090/debug/pipeline/stats
# View logs
# systemd: journalctl -u telegen -f
# Docker: docker logs telegen -f
# Kubernetes: kubectl -n telegen logs -l app=telegen -f
Troubleshooting
Common Issues
| Issue | Cause | Solution |
|---|---|---|
| eBPF not starting | Missing capabilities | Run privileged or add CAP_BPF |
| No metrics exported | OTLP endpoint unreachable | Check network/firewall |
| High memory usage | Cardinality explosion | Enable cardinality limits |
| Config not reloading | SIGHUP not working | Check hot_reload.enabled |
| Data loss on restart | No persistent queue | Enable queue.enabled |
Debug Commands
# Check kernel version
uname -r # Must be 4.18+
# Check eBPF support
ls /sys/fs/bpf
# Check capabilities (container)
capsh --print
# Test OTLP connectivity
nc -zv otel-collector 4317
Logs
# Increase log level
# In config.yaml: log_level: debug
# Filter errors only
journalctl -u telegen | grep -i error
Next Steps
- Full Reference - Complete configuration reference
- Features - Feature documentation
- Monitoring - Monitoring Telegen itself
- Performance Tuning - Optimization guide