Pipeline Configuration
Complete reference for the unified pipeline configuration.
Overview
The unified pipeline provides enhanced data quality controls, transformation capabilities, and operational features. Enable it by setting:
pipeline:
  enabled: true
Data Quality Limits
Cardinality Limiter
Prevents metric cardinality explosion that can overwhelm backends.
pipeline:
  limits:
    cardinality:
      enabled: true
      # Per-metric series limit (unique label combinations)
      default_max_series: 10000
      # Global limit across all metrics
      global_max_series: 100000
      # Per-metric overrides for high-cardinality metrics
      metric_limits:
        http_request_duration_seconds: 50000
        api_requests_total: 20000
      # How long to remember series (for cleanup)
      series_ttl: 1h
      # Action when limit reached
      # "drop" - silently drop new series
      # "hash_labels" - hash label values to reduce cardinality
      on_limit: drop
| Parameter | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | false | Enable cardinality limiting |
| `default_max_series` | int | 10000 | Default per-metric series limit |
| `global_max_series` | int | 100000 | Total series limit across all metrics |
| `metric_limits` | map | {} | Per-metric overrides |
| `series_ttl` | duration | 1h | Time to remember series for cleanup |
| `on_limit` | string | drop | Action when limit reached |
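To make the interaction between the per-metric and global limits concrete, here is a minimal Python sketch of the `drop` behavior. It is an illustrative model, not Telegen's implementation: the class name `CardinalityLimiter` and its `admit` method are hypothetical, `series_ttl` expiry is left out, and it assumes that data points for already-known series always pass.

```python
from dataclasses import dataclass, field


@dataclass
class CardinalityLimiter:
    """Hypothetical model of the per-metric and global series limits."""
    default_max_series: int = 10000
    global_max_series: int = 100000
    metric_limits: dict = field(default_factory=dict)
    _series: dict = field(default_factory=dict)  # metric name -> set of label sets

    def _total(self) -> int:
        return sum(len(seen) for seen in self._series.values())

    def admit(self, metric: str, labels: dict) -> bool:
        """Return True if the data point is kept, False if it is dropped."""
        key = frozenset(labels.items())
        seen = self._series.setdefault(metric, set())
        if key in seen:
            return True  # assumption: known series are never dropped
        limit = self.metric_limits.get(metric, self.default_max_series)
        if len(seen) >= limit or self._total() >= self.global_max_series:
            return False  # on_limit: drop
        seen.add(key)
        return True


limiter = CardinalityLimiter(default_max_series=2, global_max_series=3,
                             metric_limits={"api_requests_total": 1})
results = [
    limiter.admit("cpu", {"host": "a"}),
    limiter.admit("cpu", {"host": "b"}),
    limiter.admit("cpu", {"host": "c"}),               # per-metric limit reached
    limiter.admit("cpu", {"host": "a"}),               # existing series
    limiter.admit("api_requests_total", {"p": "/"}),
    limiter.admit("api_requests_total", {"p": "/x"}),  # override limit reached
]
print(results)  # [True, True, False, True, True, False]
```

A series here is one unique combination of label values for a metric, which is why `metric_limits` overrides are useful for metrics that legitimately carry many label combinations.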
Rate Limiter
Controls data ingestion rate to protect backends.
pipeline:
  limits:
    rate:
      enabled: true
      # Maximum data points/spans/logs per second
      metrics_per_second: 100000
      traces_per_second: 50000
      logs_per_second: 200000
      # Allow temporary bursts
      burst_multiplier: 2.0
      # Action when limit reached
      on_limit: drop
| Parameter | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | false | Enable rate limiting |
| `metrics_per_second` | int | 100000 | Max metric data points/second |
| `traces_per_second` | int | 50000 | Max spans/second |
| `logs_per_second` | int | 200000 | Max log records/second |
| `burst_multiplier` | float | 2.0 | Allow this multiple for bursts |
| `on_limit` | string | drop | Action when limit reached |
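One common way to combine a steady rate with a burst allowance is a token bucket. The sketch below assumes that model, with a bucket capacity of `rate * burst_multiplier`; whether Telegen uses a token bucket internally is not stated here, and the `TokenBucket` class is hypothetical. Time is passed in explicitly so the example is deterministic.

```python
class TokenBucket:
    """Hypothetical token bucket: refills at `rate`, holds at most rate * burst."""

    def __init__(self, rate_per_second: float, burst_multiplier: float = 2.0):
        self.rate = rate_per_second
        self.capacity = rate_per_second * burst_multiplier
        self.tokens = self.capacity  # start full
        self.last = 0.0              # timestamp of the previous call, in seconds

    def allow(self, now: float, count: int = 1) -> bool:
        """Return True if `count` items may pass at time `now`."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= count:
            self.tokens -= count
            return True
        return False  # on_limit: drop


bucket = TokenBucket(rate_per_second=100, burst_multiplier=2.0)
burst_ok = bucket.allow(now=0.0, count=200)   # the full burst allowance passes
over_limit = bucket.allow(now=0.0, count=1)   # bucket is now empty
after_wait = bucket.allow(now=0.5, count=50)  # 0.5 s refills 50 tokens
```

Under this model a `burst_multiplier` of 2.0 lets a quiet source briefly send up to twice its per-second limit, while the sustained rate stays at the configured value.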
Attribute Limiter
Controls attribute counts and sizes to reduce payload size.
pipeline:
  limits:
    attributes:
      enabled: true
      # Maximum attributes per level
      max_resource_attributes: 128
      max_scope_attributes: 64
      max_data_point_attributes: 32
      # Maximum value sizes
      max_attribute_value_size: 4096
      max_attribute_key_size: 256
      # Protected attributes (never dropped or truncated)
      protected_attributes:
        - service.name
        - service.namespace
        - k8s.pod.name
        - k8s.namespace.name
| Parameter | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | false | Enable attribute limiting |
| `max_resource_attributes` | int | 128 | Max attributes on resource |
| `max_scope_attributes` | int | 64 | Max attributes on scope |
| `max_data_point_attributes` | int | 32 | Max attributes on data points |
| `max_attribute_value_size` | int | 4096 | Max string value length |
| `max_attribute_key_size` | int | 256 | Max key length |
| `protected_attributes` | []string | [] | Never drop or truncate these |
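The following Python sketch shows one plausible reading of how a count limit, a value-size limit, and protected attributes could interact. The function `limit_attributes` is hypothetical, and several details are assumptions rather than documented behavior: protected attributes count toward the limit, surplus attributes are dropped in insertion order, and oversized string values are truncated rather than removed. Key-size limits are not modeled.

```python
def limit_attributes(attrs: dict, max_count: int, max_value_size: int,
                     protected: set) -> dict:
    """Return a copy of `attrs` that respects the count and value-size limits."""
    out = {}
    # Protected attributes are copied first and never modified.
    for key, value in attrs.items():
        if key in protected:
            out[key] = value
    for key, value in attrs.items():
        if key in protected:
            continue
        if len(out) >= max_count:
            break  # assumption: surplus attributes are dropped
        if isinstance(value, str) and len(value) > max_value_size:
            value = value[:max_value_size]  # assumption: long values are truncated
        out[key] = value
    return out


attrs = {"service.name": "checkout", "note": "x" * 10, "a": "1", "b": "2"}
limited = limit_attributes(attrs, max_count=3, max_value_size=4,
                           protected={"service.name"})
print(limited)  # {'service.name': 'checkout', 'note': 'xxxx', 'a': '1'}
```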
Signal Transformation
Transform Rules
Apply rule-based transformations to signals before export.
pipeline:
  transform:
    enabled: true
    rules:
      # Add cluster information
      - name: add-cluster-info
        enabled: true
        match:
          signal_types: [metrics, traces, logs]
        actions:
          - type: set_attribute
            set_attribute:
              key: k8s.cluster.name
              value: production
      # Filter debug metrics
      - name: drop-debug-metrics
        enabled: true
        match:
          signal_types: [metrics]
          metric_names:
            - "^debug_.*"
            - "^internal_.*"
        actions:
          - type: filter
            filter:
              drop: true
      # Hash sensitive data
      - name: hash-user-ids
        enabled: true
        match:
          signal_types: [traces]
          resource_attributes:
            service.name: "user-service"
        actions:
          - type: hash_attribute
            hash_attribute:
              key: user.id
              algorithm: sha256
Rule Structure
| Field | Type | Description |
|---|---|---|
| `name` | string | Rule identifier |
| `enabled` | bool | Enable/disable rule |
| `match` | object | Conditions for applying rule |
| `actions` | []object | Actions to perform |
Match Conditions
| Field | Type | Description |
|---|---|---|
| `signal_types` | []string | Signal types: metrics, traces, logs |
| `resource_attributes` | map | Match on resource attribute values |
| `metric_names` | []string | Regex patterns for metric names |
| `span_names` | []string | Regex patterns for span names |
| `log_bodies` | []string | Regex patterns for log bodies |
Action Types
set_attribute - Add or update an attribute
- type: set_attribute
  set_attribute:
    key: environment
    value: ${ENVIRONMENT}  # Supports env vars
delete_attribute - Remove an attribute
- type: delete_attribute
  delete_attribute:
    key: internal.debug.info
rename_attribute - Rename an attribute key
- type: rename_attribute
  rename_attribute:
    old_key: host.hostname
    new_key: host.name
hash_attribute - Hash an attribute value
- type: hash_attribute
  hash_attribute:
    key: user.id
    algorithm: sha256  # sha256, sha512, xxhash
    salt: ${HASH_SALT}  # Optional salt
filter - Drop matching signals
- type: filter
  filter:
    drop: true
transform - Regex transformation
- type: transform
  transform:
    key: http.url
    pattern: "([?&])password=[^&]*"
    replacement: "${1}password=***"
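As a rough illustration of what the `hash_attribute` and `transform` actions do to an attribute map, here is a Python sketch. The helper functions are hypothetical, and the way the salt is combined with the value (prepended before hashing) is an assumption. Python's `re` module writes a group reference as `\g<1>` where the configuration above uses `${1}`.

```python
import hashlib
import re


def hash_attribute(attrs: dict, key: str, salt: str = "") -> None:
    """Replace attrs[key] with the SHA-256 hex digest of salt + value."""
    if key in attrs:
        digest = hashlib.sha256((salt + str(attrs[key])).encode("utf-8"))
        attrs[key] = digest.hexdigest()


def transform_attribute(attrs: dict, key: str, pattern: str, replacement: str) -> None:
    """Apply a regex substitution to attrs[key] if the attribute exists."""
    if key in attrs:
        attrs[key] = re.sub(pattern, replacement, attrs[key])


span = {
    "user.id": "42",
    "http.url": "https://example.com/login?user=a&password=hunter2",
}
hash_attribute(span, "user.id", salt="pepper")
transform_attribute(span, "http.url",
                    r"([?&])password=[^&]*", r"\g<1>password=***")
print(span["http.url"])  # https://example.com/login?user=a&password=***
```

The capture group keeps the `?` or `&` separator that precedes `password=`, so the rest of the URL stays intact after masking.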
PII Redaction
Automatically detect and mask personally identifiable information.
pipeline:
  pii_redaction:
    enabled: true
    # Mask string
    redaction_string: "[REDACTED]"
    # Scan log message bodies (impacts performance)
    scan_log_bodies: true
    # Scan span names
    scan_span_names: false
    # Use hash instead of mask (preserves uniqueness)
    hash_redaction: false
    # Attributes that should never be scanned
    allowed_attributes:
      - service.name
      - k8s.pod.name
      - http.route
    # PII detection rules
    rules:
      - name: email
        type: email
        enabled: true
      - name: phone
        type: phone
        enabled: true
      - name: ssn
        type: ssn
        enabled: true
      - name: credit_card
        type: credit_card
        enabled: true
      - name: jwt
        type: jwt
        enabled: true
      - name: api_key
        type: api_key
        enabled: true
      # Custom pattern
      - name: internal_id
        type: regex
        enabled: true
        pattern: "INTERNAL-[A-Z0-9]{8}"
Built-in PII Types
| Type | Matches | Example |
|---|---|---|
| `email` | Email addresses | user@example.com |
| `phone` | Phone numbers | 555-123-4567 |
| `ssn` | Social Security Numbers | 123-45-6789 |
| `credit_card` | Credit card numbers | 4111-1111-1111-1111 |
| `ipv4` | IPv4 addresses | 192.168.1.1 |
| `ipv6` | IPv6 addresses | 2001:db8::1 |
| `jwt` | JWT tokens | eyJhbG... |
| `api_key` | API keys | sk-xxx, AKIA... |
| `password` | Password-like strings | (configurable) |
| `regex` | Custom regex pattern | User-defined |
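The difference between masking and `hash_redaction` can be sketched in a few lines of Python. The regular expressions below are simplified stand-ins chosen for illustration, not the patterns Telegen ships, and the 16-character truncated SHA-256 token is an arbitrary assumption; only the `INTERNAL-[A-Z0-9]{8}` pattern comes from the configuration above.

```python
import hashlib
import re

# Simplified, hypothetical detection rules keyed by rule name.
RULES = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "internal_id": re.compile(r"INTERNAL-[A-Z0-9]{8}"),
}


def redact(text: str, redaction_string: str = "[REDACTED]",
           hash_redaction: bool = False) -> str:
    """Mask (or hash) every match of every rule in `text`."""
    def replace(match: re.Match) -> str:
        if hash_redaction:
            # Equal inputs map to equal tokens, so uniqueness is preserved.
            return hashlib.sha256(match.group(0).encode("utf-8")).hexdigest()[:16]
        return redaction_string

    for pattern in RULES.values():
        text = pattern.sub(replace, text)
    return text


masked = redact("user@example.com sent INTERNAL-AB12CD34, SSN 123-45-6789")
print(masked)  # [REDACTED] sent [REDACTED], SSN [REDACTED]
hashed = redact("a@b.co", hash_redaction=True)
```

With hashing enabled, two log lines that mention the same email address still correlate after redaction, which masking with a fixed string cannot offer.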
Export Configuration
OTLP Export
pipeline:
  export:
    otlp:
      endpoint: otel-collector:4317
      protocol: grpc  # grpc or http
      insecure: true
      # TLS configuration
      tls:
        cert_file: /etc/telegen/certs/client.crt
        key_file: /etc/telegen/certs/client.key
        ca_file: /etc/telegen/certs/ca.crt
        insecure_skip_verify: false
      # Headers
      headers:
        X-API-Key: ${OTLP_API_KEY}
        Authorization: Bearer ${OTLP_TOKEN}
      # Timeouts
      timeout: 30s
      # Retry configuration
      retry:
        enabled: true
        max_attempts: 3
        initial_interval: 1s
        max_interval: 30s
        backoff_multiplier: 2.0
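The retry settings describe an exponential backoff. The Python sketch below computes the waits such a policy would produce, assuming that `max_attempts` counts the initial attempt and that no jitter is applied; both are assumptions, and `retry_intervals` is a hypothetical helper.

```python
def retry_intervals(max_attempts: int = 3, initial_interval: float = 1.0,
                    max_interval: float = 30.0,
                    backoff_multiplier: float = 2.0) -> list:
    """Return the waits (in seconds) between consecutive attempts."""
    intervals = []
    wait = initial_interval
    for _ in range(max_attempts - 1):  # one fewer wait than attempts
        intervals.append(min(wait, max_interval))
        wait *= backoff_multiplier
    return intervals


default_waits = retry_intervals()             # [1.0, 2.0]
long_waits = retry_intervals(max_attempts=8)  # capped at max_interval
print(default_waits, long_waits)
```

With the values shown above, a failing export would wait 1 s and then 2 s before giving up; `max_interval` only starts to matter once `max_attempts` is raised to 7 or more.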
Batching
pipeline:
  export:
    batch:
      # Items per batch
      size: 1000
      # Max wait before flush
      timeout: 5s
      # Minimum batch size to send immediately
      send_batch_size: 500
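A size-or-timeout batcher can be modeled as follows. This Python sketch covers only `size` and `timeout`; `send_batch_size` is not modeled because its exact interaction with `size` is not spelled out here. The `Batcher` class is hypothetical and takes timestamps as arguments to stay deterministic.

```python
class Batcher:
    """Hypothetical batcher: flush when full, or when the oldest item is too old."""

    def __init__(self, size: int = 1000, timeout: float = 5.0):
        self.size = size
        self.timeout = timeout
        self.items = []
        self.first_at = None  # arrival time of the oldest buffered item

    def add(self, item, now: float):
        """Buffer an item; return a full batch if the size limit is reached."""
        if not self.items:
            self.first_at = now
        self.items.append(item)
        return self._flush() if len(self.items) >= self.size else None

    def tick(self, now: float):
        """Called periodically; return a partial batch once the timeout elapses."""
        if self.items and now - self.first_at >= self.timeout:
            return self._flush()
        return None

    def _flush(self):
        batch, self.items = self.items, []
        return batch


batcher = Batcher(size=3, timeout=5.0)
batcher.add("a", now=0.0)
batcher.add("b", now=1.0)
too_early = batcher.tick(now=4.0)   # None: oldest item is only 4 s old
timed_out = batcher.tick(now=5.0)   # ['a', 'b']: timeout reached
full = [batcher.add(x, now=6.0) for x in ("c", "d", "e")][-1]  # ['c', 'd', 'e']
```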
Multi-Endpoint Export
Supports failover, round-robin, or fan-out across multiple endpoints.
pipeline:
  export:
    multi_endpoint:
      enabled: true
      # Mode: failover, round_robin, fanout
      mode: failover
      endpoints:
        - name: primary
          endpoint: primary-collector:4317
          priority: 1
        - name: secondary
          endpoint: secondary-collector:4317
          priority: 2
        - name: archive
          endpoint: archive-collector:4317
          mode: fanout  # Always send regardless of mode
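The three modes, plus the per-endpoint `mode: fanout` override, can be sketched as a selection function. This Python model is hypothetical: it assumes that a lower `priority` number is preferred in failover mode and that health status is known from elsewhere. Neither detail is confirmed by the configuration above.

```python
def select_endpoints(endpoints: list, mode: str, healthy: set,
                     counter: int = 0) -> list:
    """Return the names of the endpoints that receive the next batch."""
    # Endpoints with their own `mode: fanout` always receive data.
    always = [e["name"] for e in endpoints if e.get("mode") == "fanout"]
    pool = [e for e in endpoints
            if e.get("mode") != "fanout" and e["name"] in healthy]
    if not pool:
        return always
    if mode == "fanout":
        chosen = [e["name"] for e in pool]
    elif mode == "round_robin":
        chosen = [pool[counter % len(pool)]["name"]]
    else:  # failover; assumption: lowest priority number wins
        chosen = [min(pool, key=lambda e: e.get("priority", 0))["name"]]
    return chosen + always


ENDPOINTS = [
    {"name": "primary", "endpoint": "primary-collector:4317", "priority": 1},
    {"name": "secondary", "endpoint": "secondary-collector:4317", "priority": 2},
    {"name": "archive", "endpoint": "archive-collector:4317", "mode": "fanout"},
]

all_up = select_endpoints(ENDPOINTS, "failover", healthy={"primary", "secondary"})
primary_down = select_endpoints(ENDPOINTS, "failover", healthy={"secondary"})
print(all_up, primary_down)  # ['primary', 'archive'] ['secondary', 'archive']
```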
Persistent Queue
Survive restarts without data loss.
pipeline:
  export:
    queue:
      enabled: true
      directory: /var/lib/telegen/queue
      max_size_bytes: 500000000  # 500MB
      max_items: 100000
Operations
Hot Reload
Reload configuration without restart.
pipeline:
  operations:
    hot_reload:
      enabled: true
      # Path to watch
      config_path: /etc/telegen/config.yaml
      # Check interval for file changes
      check_interval: 30s
      # Enable SIGHUP reload
      enable_sighup: true
      # Validation timeout
      validation_timeout: 10s
      # Auto-rollback on error
      rollback_on_error: true
Trigger reload:
# Send SIGHUP
kill -HUP $(pidof telegen)
# systemd
systemctl reload telegen
Graceful Shutdown
Drain in-flight data before stopping.
pipeline:
  operations:
    shutdown:
      # Total shutdown timeout
      timeout: 30s
      # Time to drain in-flight data
      drain_timeout: 10s
      # Mark unhealthy during shutdown
      enable_health_check: true
Environment Variables
All configuration values support environment variable substitution:
pipeline:
  export:
    otlp:
      endpoint: ${OTLP_ENDPOINT:-otel-collector:4317}
      headers:
        Authorization: Bearer ${OTLP_TOKEN}
  transform:
    rules:
      - name: add-env
        actions:
          - type: set_attribute
            set_attribute:
              key: environment
              value: ${ENVIRONMENT:-production}
| Syntax | Description |
|---|---|
| `${VAR}` | Value of VAR; error if unset |
| `${VAR:-default}` | Value of VAR, or `default` if unset |
| `${VAR:?error}` | Value of VAR; fails with the message `error` if unset |
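The three substitution forms follow shell-style parameter expansion. The Python sketch below shows how such a resolver might behave; the `substitute` function and its regular expression are hypothetical, and it treats only unset variables (not empty ones) as missing, matching the table's wording.

```python
import re

# ${NAME}, ${NAME:-default}, or ${NAME:?message}
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([-?])([^}]*))?\}")


def substitute(text: str, env: dict) -> str:
    """Expand variable references in `text` using the mapping `env`."""
    def expand(match: re.Match) -> str:
        name, op, arg = match.group(1), match.group(2), match.group(3)
        if name in env:
            return env[name]
        if op == "-":
            return arg  # fall back to the default
        # Plain ${NAME} and ${NAME:?message} both fail when NAME is unset.
        raise KeyError(arg if op == "?" and arg else f"{name} is not set")

    return _VAR.sub(expand, text)


endpoint = substitute("${OTLP_ENDPOINT:-otel-collector:4317}", {})
header = substitute("Bearer ${OTLP_TOKEN}", {"OTLP_TOKEN": "abc"})
print(endpoint, "|", header)  # otel-collector:4317 | Bearer abc
```

In a real process the mapping would typically be `os.environ`; a plain dictionary is used here so the example does not depend on the caller's environment.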
Complete Example
telegen:
  mode: agent
  service_name: telegen
  log_level: info
pipeline:
  enabled: true
  limits:
    cardinality:
      enabled: true
      default_max_series: 10000
      global_max_series: 100000
    rate:
      enabled: true
      metrics_per_second: 100000
      traces_per_second: 50000
      logs_per_second: 200000
    attributes:
      enabled: true
      max_resource_attributes: 128
      protected_attributes:
        - service.name
        - k8s.namespace.name
  transform:
    enabled: true
    rules:
      - name: add-cluster
        match:
          signal_types: [metrics, traces, logs]
        actions:
          - type: set_attribute
            set_attribute:
              key: k8s.cluster.name
              value: ${CLUSTER_NAME:-default}
  pii_redaction:
    enabled: true
    scan_log_bodies: true
  export:
    otlp:
      endpoint: ${OTLP_ENDPOINT:-otel-collector:4317}
      insecure: true
    batch:
      size: 1000
      timeout: 5s
    queue:
      enabled: true
      directory: /var/lib/telegen/queue
  operations:
    hot_reload:
      enabled: true
      enable_sighup: true
    shutdown:
      timeout: 30s
      drain_timeout: 10s
agent:
  ebpf:
    enabled: true
  profiling:
    enabled: true
  discovery:
    enabled: true
self_telemetry:
  enabled: true
  listen: ":19090"