Agent Mode Configuration
Detailed configuration guide for Telegen Agent mode.
Overview
Agent Mode is the default operating mode for Telegen. In this mode, Telegen runs directly on hosts, collects telemetry using eBPF, and exports data to your OTLP backend.
flowchart LR
subgraph Host["Host System"]
K["Kernel"]
A["Applications"]
TG["Telegen Agent"]
end
K -->|eBPF| TG
A -->|Auto-instrumented| TG
TG -->|OTLP| OC["OTel Collector"]
When to Use Agent Mode
Use Agent Mode when you want to:
- Collect host-level telemetry - CPU, memory, disk, network
- Auto-instrument applications - No code changes required
- Enable distributed tracing - HTTP, gRPC, database calls
- Enable continuous profiling - CPU, memory, off-CPU
- Monitor security events - Syscalls, file integrity
Minimal Agent Configuration
telegen:
mode: agent
otlp:
endpoint: "otel-collector:4317"
eBPF Configuration
Ring Buffer Sizing
Ring buffers are used for high-throughput event streaming from kernel to userspace:
agent:
ebpf:
# Ring buffer size - must be power of 2
# Larger = more buffer, less event loss
# Smaller = less memory usage
ringbuf_size: 16777216 # 16MB (default)
| Size | Use Case |
|---|---|
| 4MB | Low-throughput environments |
| 16MB | Default, balanced |
| 64MB | High-throughput, many connections |
| 256MB | Very high volume, latency-sensitive |
Perf Buffer Sizing
Perf buffers are used for per-CPU event collection:
agent:
ebpf:
perf_buffer_size: 8192 # 8KB per CPU (default)
Network Tracing
agent:
ebpf:
network:
enabled: true
# Protocol tracing
http: true # HTTP/1.1 and HTTP/2
grpc: true # gRPC over HTTP/2
dns: true # DNS queries/responses
# TCP metrics
tcp_metrics: true # RTT, retransmits, connections
# Interface filtering (empty = all interfaces)
interfaces: []
# Exclude by port
exclude_ports:
- 22 # SSH
- 2379 # etcd
Syscall Tracing
agent:
ebpf:
syscalls:
enabled: true
# Include specific syscalls (empty = all)
include: []
# Exclude noisy syscalls
exclude:
- futex
- nanosleep
- clock_gettime
- poll
- select
- epoll_wait
Process Discovery
Telegen discovers which processes to instrument using port-based and/or path-based selection.
Basic Discovery
agent:
discovery:
# Skip services already instrumented with OTel SDKs
exclude_otel_instrumented_services: true
# Process discovery timing
min_process_age: 5s
poll_interval: 5s
Port-Based Discovery (Recommended)
Port-based discovery is more reliable in containerized environments:
agent:
discovery:
instrument:
# Single port
- open_ports: "8080"
# Port range
- open_ports: "8000-8999"
# Multiple ports and ranges
- open_ports: "80,443,3000,8080-8089"
Path-Based Discovery
Discover by executable path pattern (glob syntax):
agent:
discovery:
instrument:
# All Java processes
- exe_path: "*java*"
# Specific application
- exe_path: "/usr/bin/myapp"
# Node.js
- exe_path: "*node*"
Kubernetes-Aware Discovery
agent:
discovery:
instrument:
# By namespace
- k8s_namespace: "production"
# By namespace + port
- k8s_namespace: "production"
open_ports: "8080"
# By pod labels
- k8s_pod_labels:
app: "frontend*"
version: "v2*"
# By annotations
- k8s_pod_annotations:
telegen.io/instrument: "true"
Excluding Services
agent:
discovery:
instrument:
- open_ports: "8080-8089"
exclude_instrument:
# Test namespaces
- k8s_namespace: "*-test"
# Prometheus metrics port
- open_ports: "9090"
# Health check services
- exe_path: "*health*"
# Default exclusions (observability tools)
default_exclude_instrument:
- exe_path: "*telegen*"
- exe_path: "*otelcol*"
- k8s_namespace: "kube-system"
Full Discovery Example
agent:
discovery:
exclude_otel_instrumented_services: true
skip_go_specific_tracers: false
instrument:
# Common app ports
- open_ports: "8080-8089"
- open_ports: "3000,5000"
# Java in production
- exe_path: "*java*"
k8s_namespace: "production"
# Opt-in via annotation
- k8s_pod_annotations:
telegen.io/instrument: "true"
exclude_instrument:
- k8s_namespace: "kube-system"
- open_ports: "9090"
min_process_age: 5s
poll_interval: 5s
Metadata Discovery
Automatic detection of cloud and runtime environments:
agent:
metadata_discovery:
enabled: true
interval: 30s
detect_cloud: true # AWS, GCP, Azure
detect_kubernetes: true # K8s metadata
detect_runtimes: true # Go, Java, Python, Node.js
detect_databases: true # MySQL, PostgreSQL, MongoDB
detect_message_queues: true # Kafka, RabbitMQ, Redis
Runtime Detection
Telegen automatically detects and instruments:
| Runtime | Detection Method | Tracing Support |
|---|---|---|
| Go | Binary analysis, goroutine patterns | ✅ Full |
| Java | JVM process, JFR integration | ✅ Full |
| Python | Interpreter process, frame analysis | ✅ Full |
| Node.js | V8 process detection | ✅ Full |
| .NET | CoreCLR detection | ✅ Full |
| Ruby | Interpreter detection | ⚠️ Partial |
| Rust | Binary analysis | ✅ Full |
Continuous Profiling
Enable CPU, memory, and off-CPU profiling:
agent:
profiling:
enabled: true
# Sampling rate in Hz
sample_rate: 99 # 99 Hz is common to avoid aliasing
# Profile types
cpu: true # On-CPU time
off_cpu: true # Off-CPU waiting time
memory: true # Heap allocations
mutex: true # Lock contention (Go, Java)
block: true # Blocking operations (Go)
goroutine: true # Goroutine profiles (Go only)
# Each profile sample duration
duration: 10s
# How often to upload profiles
upload_interval: 60s
# Symbol resolution
symbols:
demangle_rust: true
demangle_cpp: true
Security Monitoring
Enable runtime security monitoring:
agent:
security:
enabled: true
# Syscall auditing
syscall_audit:
enabled: true
syscalls:
- execve # Process execution
- execveat
- ptrace # Debugging/tracing
- setuid # Privilege changes
- setgid
- mount # Filesystem mounts
- umount
- init_module # Kernel modules
- finit_module
- delete_module
- open_by_handle_at # Filesystem escape
# File integrity monitoring
file_integrity:
enabled: true
paths:
- /etc/passwd
- /etc/shadow
- /etc/sudoers
- /etc/ssh/sshd_config
- /root/.ssh
- /etc/cron.d
- /etc/crontab
recursive: true
events:
- create
- modify
- delete
- chmod
- chown
# Container escape detection
container_escape:
enabled: true
Log Collection
agent:
logs:
enabled: true
# File paths to tail
paths:
- /var/log/syslog
- /var/log/auth.log
- /var/log/*.log
- /var/log/**/*.log
# Collect container logs
container_logs: true
# Exclude patterns
exclude:
- "*.gz"
- "*.zip"
- "*.old"
- "lastlog"
- "wtmp"
- "btmp"
# Multiline log handling
multiline:
enabled: true
pattern: "^\\d{4}-\\d{2}-\\d{2}" # ISO date
negate: true
match: after
max_lines: 500
timeout: 5s
GPU Monitoring
agent:
gpu:
enabled: true
# NVIDIA GPU support (via NVML)
nvidia: true
# AMD GPU support (via ROCm SMI)
amd: false
# Polling interval
poll_interval: 10s
# Metrics to collect
metrics:
utilization: true # GPU utilization %
memory: true # Memory usage
temperature: true # GPU temperature
power: true # Power consumption
clock: true # Clock speeds
pcie_throughput: true # PCIe bandwidth
Resource Limits
agent:
resources:
# CPU limit (number of cores)
cpu_limit: 1.0
# Memory limit
memory_limit: "512Mi"
# Limit concurrent eBPF programs
max_ebpf_programs: 100
# Rate limiting
rate_limit:
spans_per_second: 10000
metrics_per_second: 50000
logs_per_second: 5000
Kubernetes-Specific
When running in Kubernetes, additional features are available:
agent:
kubernetes:
enabled: true
# Enrich with pod metadata
pod_metadata: true
# Enrich with node metadata
node_metadata: true
# Label filtering
label_allowlist:
- "app.kubernetes.io/*"
- "helm.sh/*"
- "app"
- "version"
# Namespace filtering
namespace_include: [] # Empty = all
namespace_exclude:
- kube-system
- kube-public
Example: High-Security Environment
telegen:
mode: agent
log_level: info
otlp:
endpoint: "otel-collector:4317"
tls:
enabled: true
ca_file: "/etc/ssl/certs/ca.crt"
cert_file: "/etc/ssl/certs/client.crt"
key_file: "/etc/ssl/certs/client.key"
agent:
ebpf:
enabled: true
network:
enabled: true
http: true
dns: true
syscalls:
enabled: true
security:
enabled: true
syscall_audit:
enabled: true
file_integrity:
enabled: true
paths:
- /etc
- /root
- /home
recursive: true
container_escape:
enabled: true
profiling:
enabled: true
cpu: true
memory: true
Example: Performance-Optimized
telegen:
mode: agent
log_level: warn
otlp:
endpoint: "otel-collector:4317"
compression: gzip
agent:
ebpf:
enabled: true
ringbuf_size: 67108864 # 64MB
perf_buffer_size: 16384 # 16KB
network:
enabled: true
exclude_ports: [22, 2379, 2380]
syscalls:
enabled: false # Disable for performance
resources:
cpu_limit: 2.0
memory_limit: "1Gi"
rate_limit:
spans_per_second: 50000
metrics_per_second: 100000
Next Steps
- Collector Mode - Remote collection without eBPF
- Environment Variables - Environment variable reference