Log Collection & Trace Enrichment
Telegen provides native log collection with automatic trace correlation, delivering 100% OTLP-compliant logs.
Overview
Telegen’s log pipeline offers:
- Native container runtime parsers - Docker JSON, containerd, CRI-O
- Automatic K8s metadata extraction - Pod, namespace, container from log paths
- eBPF-based trace injection - Link logs to distributed traces
- Application-aware parsing - Spring Boot, Log4j, Logback patterns
- Non-Kubernetes support - File path-based correlation for VMs
No sidecar containers. No separate log shippers. One unified agent.
How It Works
flowchart TB
subgraph Sources["Log Sources"]
A["Container Logs\n/var/log/containers/"]
B["Application Logs\n/var/log/myapp/"]
C["System Logs\n/var/log/syslog"]
end
subgraph Telegen["Telegen Agent"]
subgraph eBPF["eBPF Layer"]
E["Log Enricher\n(write syscall intercept)"]
T["Trace Context Cache"]
end
subgraph Pipeline["Log Pipeline"]
F["Filelog Reader"]
P["Runtime Parsers"]
TE["Trace Enricher"]
K["K8s Metadata"]
end
end
A --> F
B --> F
C --> F
F --> P
P --> TE
E --> T
T --> TE
TE --> K
K --> O["OTLP Exporter"]
O --> C1["OTel Collector\nor Backend"]
The Correlation Bridge
Telegen correlates plain-text logs with distributed traces in four steps:
- eBPF captures trace context - When an application writes a log line, the log_enricher intercepts the write() syscall and captures the current trace context
- Time-windowed cache - Trace context is stored in a correlation cache keyed by container ID (K8s) or file path (non-K8s) + timestamp
- Filelog enrichment - When the log pipeline reads the log file, it queries the cache using timestamp matching (±100ms tolerance)
- OTLP emission - The enriched log record includes trace_id, span_id, and trace_flags
This approach works for any log format, including plain-text logs without native trace context.
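The store-then-lookup flow can be sketched in a few lines of Python. This is a minimal illustration, not Telegen's implementation: the class name `CorrelationCache`, its methods, and the nearest-timestamp policy are assumptions made for the example. Only the key formats and the 100ms tolerance come from the description above.

```python
from bisect import bisect_left
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TraceContext:
    trace_id: str
    span_id: str
    trace_flags: int


class CorrelationCache:
    """Hypothetical time-windowed cache keyed by 'cid:...' or 'path:...'."""

    def __init__(self, tolerance_ns: int = 100_000_000) -> None:
        self.tolerance_ns = tolerance_ns
        # Per key: timestamps kept sorted, with contexts in the same order.
        self._times = defaultdict(list)
        self._contexts = defaultdict(list)

    def store(self, key: str, ts_ns: int, ctx: TraceContext) -> None:
        """Record the trace context seen at write time."""
        idx = bisect_left(self._times[key], ts_ns)
        self._times[key].insert(idx, ts_ns)
        self._contexts[key].insert(idx, ctx)

    def lookup(self, key: str, ts_ns: int) -> Optional[TraceContext]:
        """Return the context closest to ts_ns, if within the tolerance."""
        times = self._times.get(key)
        if not times:
            return None
        idx = bisect_left(times, ts_ns)
        best_idx, best_delta = None, self.tolerance_ns + 1
        # The nearest stored timestamp is either just before or just after idx.
        for i in (idx - 1, idx):
            if 0 <= i < len(times):
                delta = abs(times[i] - ts_ns)
                if delta < best_delta:
                    best_idx, best_delta = i, delta
        return None if best_idx is None else self._contexts[key][best_idx]
```

A log line read 50ms after the write resolves to the stored context, while one read 150ms later does not. A production cache would also need expiry, which this sketch omits.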
Configuration
Basic Log Collection
logs:
  enabled: true
  filelog:
    enabled: true
    include:
      # Container logs (Kubernetes)
      - /var/log/containers/*.log
      - /var/log/pods/*/*/*/*.log
      # Application logs (non-K8s)
      - /var/log/myapp/*.log
    # Automatic runtime detection
    auto_detect_runtime: true
    # K8s metadata from log paths
    k8s_metadata_from_path: true
Full Configuration with Trace Enrichment
logs:
  enabled: true
  # eBPF log enricher - captures trace context at write time
  log_enricher:
    enabled: true
    # Target files to track (supports glob patterns)
    target_paths:
      - /var/log/containers/*.log
      - /var/log/pods/*/*/*/*.log
      - /var/log/myapp/*.log
    # JSON injection (only works for JSON logs)
    json_injection: true
    # Record trace context for filelog correlation (works for all formats)
    filelog_correlation: true
  filelog:
    enabled: true
    include:
      - /var/log/containers/*.log
      - /var/log/pods/*/*/*/*.log
      - /var/log/myapp/*.log
    auto_detect_runtime: true
    k8s_metadata_from_path: true
    # Trace context enrichment from eBPF correlation
    trace_context_enrichment:
      enabled: true
      # Time window for matching log timestamps to trace context
      tolerance: 100ms
    # Application-specific parsers
    application_parsers:
      enabled: true
      patterns:
        - spring_boot
        - log4j
        - logback
        - python_logging
Log Enricher (eBPF)
The log_enricher uses eBPF to capture trace context at the moment of log emission.
How It Works
sequenceDiagram
participant App as Application
participant Kernel as Linux Kernel
participant eBPF as Log Enricher
participant Cache as Correlation Cache
participant Filelog as Filelog Reader
App->>Kernel: write(fd, "User login failed", 17)
Kernel->>eBPF: kprobe:vfs_write
eBPF->>eBPF: Extract trace context from goroutine/thread
eBPF->>Cache: Store(key=cid:abc123, ts=1706745600, trace=...)
Kernel->>App: bytes written
Note over Filelog: 50ms later...
Filelog->>Filelog: Read line from file
Filelog->>Cache: Lookup(key=cid:abc123, ts=1706745600, tolerance=100ms)
Cache->>Filelog: TraceContext{traceID, spanID, flags}
Filelog->>Filelog: Enrich OTLP record
Supported Log Formats
| Format | JSON Injection | Filelog Correlation |
|---|---|---|
| JSON (structured) | ✅ Direct injection | ✅ Via cache |
| Plain text | ❌ Not possible | ✅ Via cache |
| Log4j pattern | ❌ Not possible | ✅ Via cache |
| Spring Boot default | ❌ Not possible | ✅ Via cache |
JSON Injection Example
When json_injection: true, the log enricher directly modifies JSON logs:
Before:
{"timestamp":"2024-01-31T10:30:00Z","level":"INFO","message":"User login successful","user_id":"12345"}
After:
{"timestamp":"2024-01-31T10:30:00Z","level":"INFO","message":"User login successful","user_id":"12345","trace_id":"4bf92f3577b34da6a3ce929d0e0e4736","span_id":"00f067aa0ba902b7","trace_flags":"01"}
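The effect of JSON injection can be reproduced in user space for illustration. The sketch below is an assumption-laden stand-in: the real enricher works through eBPF at write time, and the function name `inject_trace_context` is hypothetical. It only shows the transformation from the "Before" line to the "After" line, and that non-JSON lines pass through unchanged.

```python
import json


def inject_trace_context(line: str, trace_id: str, span_id: str, trace_flags: str) -> str:
    """Append trace fields to a JSON log line; return other lines untouched."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return line  # Plain text: injection is not possible.
    if not isinstance(record, dict):
        return line  # Valid JSON, but not an object we can extend.
    # setdefault keeps any trace fields the application already wrote.
    record.setdefault("trace_id", trace_id)
    record.setdefault("span_id", span_id)
    record.setdefault("trace_flags", trace_flags)
    # Compact separators preserve the single-line layout of the input.
    return json.dumps(record, separators=(",", ":"), ensure_ascii=False)
```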
Container Runtime Parsers
Telegen includes native parsers for all major container runtimes.
Docker JSON
{"log":"2024-01-31T10:30:00.000Z INFO Starting application\n","stream":"stdout","time":"2024-01-31T10:30:00.123456789Z"}
Extracted fields:
- body: 2024-01-31T10:30:00.000Z INFO Starting application
- timestamp: 2024-01-31T10:30:00.123456789Z
- stream: stdout
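As a rough sketch of this extraction (the function name `parse_docker_json` is hypothetical, and Telegen's own parser may differ), each Docker JSON line is a single object whose `log`, `time`, and `stream` keys map directly to the fields listed:

```python
import json


def parse_docker_json(line: str) -> dict:
    """Map one Docker json-file log line to body, timestamp, and stream."""
    entry = json.loads(line)
    return {
        # Docker stores the trailing newline inside the "log" value.
        "body": entry["log"].rstrip("\n"),
        "timestamp": entry["time"],
        "stream": entry["stream"],
    }
```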
containerd / CRI-O
2024-01-31T10:30:00.123456789Z stdout F 2024-01-31T10:30:00.000Z INFO Starting application
Extracted fields:
- body: 2024-01-31T10:30:00.000Z INFO Starting application
- timestamp: 2024-01-31T10:30:00.123456789Z
- stream: stdout
- partial: false (F = full line)
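The CRI line layout is `<timestamp> <stream> <tag> <message>`, where the tag is `F` for a full line and `P` for a partial one. A minimal sketch of splitting it follows; the function name `parse_cri` is an assumption for the example, not a Telegen API.

```python
def parse_cri(line: str) -> dict:
    """Split one containerd / CRI-O log line into its four fields."""
    parts = line.rstrip("\n").split(" ", 3)
    if len(parts) < 3:
        raise ValueError("not a CRI log line")
    timestamp, stream, tag = parts[:3]
    return {
        "timestamp": timestamp,
        "stream": stream,
        # "P" marks a chunk of a longer line that continues in the next record.
        "partial": tag == "P",
        # An empty message leaves nothing after the tag.
        "body": parts[3] if len(parts) > 3 else "",
    }
```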
Kubernetes Metadata
Telegen extracts Kubernetes metadata from log file paths.
Path Format
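/var/log/containers/<pod_name>_<namespace>_<container_name>-<container_id>.log
/var/log/containers/nginx-7b5f8d4c9b-x2kpq_default_nginx-abc123def456.log
The pod UID is not part of this file name. It appears in the symlink target directory, /var/log/pods/<namespace>_<pod_name>_<pod_uid>/.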
Extracted Resource Attributes
resource:
  attributes:
    k8s.pod.name: nginx-7b5f8d4c9b-x2kpq
    k8s.pod.uid: <pod_uid>              # From the /var/log/pods/ directory name
    k8s.namespace.name: default
    k8s.container.name: nginx
    container.id: abc123def456
    k8s.node.name: worker-01            # From agent metadata
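Path-based extraction can be sketched with one regular expression, assuming the kubelet naming convention `<pod_name>_<namespace>_<container_name>-<container_id>.log` for files under /var/log/containers/. The function name `k8s_attributes_from_path` is hypothetical and the sketch covers only the attributes that this file name carries.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Pod and namespace names cannot contain "_", so it is a safe separator.
# The greedy container group leaves the final "-<hex>" for the container ID.
CONTAINER_LOG = re.compile(
    r"(?P<pod>[^_]+)_(?P<namespace>[^_]+)_(?P<container>.+)-(?P<container_id>[0-9a-f]+)\.log"
)


def k8s_attributes_from_path(path: str) -> Optional[dict]:
    """Return resource attributes parsed from a container log path, or None."""
    match = CONTAINER_LOG.fullmatch(PurePosixPath(path).name)
    if match is None:
        return None
    return {
        "k8s.pod.name": match["pod"],
        "k8s.namespace.name": match["namespace"],
        "k8s.container.name": match["container"],
        "container.id": match["container_id"],
    }
```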
Application Parsers
Telegen includes parsers for common application log formats.
Spring Boot
2024-01-31 10:30:00.123 INFO 1234 --- [main] c.e.MyApplication : Application started
Extracted:
timestamp: "2024-01-31T10:30:00.123Z"
severity_number: 9  # INFO
severity_text: "INFO"
attributes:
  process.pid: 1234
  thread.name: "main"
  code.namespace: "c.e.MyApplication"
body: "Application started"
Log4j / Logback
2024-01-31 10:30:00,123 [main] INFO com.example.Service - Processing request id=12345
Extracted:
timestamp: "2024-01-31T10:30:00.123Z"
severity_number: 9
severity_text: "INFO"
attributes:
  thread.name: "main"
  code.namespace: "com.example.Service"
body: "Processing request id=12345"
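Both layouts in this section can be matched with anchored regular expressions. The sketch below is illustrative only: the patterns, the function name `parse_app_log`, and the choice to treat timestamps as UTC are assumptions, and Telegen's built-in parsers may accept more variants. The severity numbers follow the OpenTelemetry log data model.

```python
import re
from typing import Optional

# OpenTelemetry severity numbers for the base level of each range.
SEVERITY = {"TRACE": 1, "DEBUG": 5, "INFO": 9, "WARN": 13, "ERROR": 17, "FATAL": 21}

# Spring Boot uses "." before milliseconds, Log4j/Logback commonly use ",".
TS = r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2})[.,](?P<ms>\d{3})"

SPRING_BOOT = re.compile(
    TS + r"\s+(?P<level>[A-Z]+)\s+(?P<pid>\d+)\s+---\s+\[\s*(?P<thread>[^\]]+?)\s*\]"
    r"\s+(?P<logger>\S+)\s*:\s(?P<body>.*)"
)
LOG4J = re.compile(
    TS + r"\s+\[(?P<thread>[^\]]+)\]\s+(?P<level>[A-Z]+)\s+(?P<logger>\S+)\s+-\s(?P<body>.*)"
)


def parse_app_log(line: str) -> Optional[dict]:
    """Try each known layout in turn; return None when nothing matches."""
    for pattern in (SPRING_BOOT, LOG4J):
        m = pattern.fullmatch(line.rstrip("\n"))
        if m is None:
            continue
        attributes = {"thread.name": m["thread"], "code.namespace": m["logger"]}
        if "pid" in m.groupdict():  # Only the Spring Boot layout carries a PID.
            attributes["process.pid"] = int(m["pid"])
        return {
            # Assumes the application logs in UTC, as the examples above do.
            "timestamp": f"{m['date']}T{m['time']}.{m['ms']}Z",
            "severity_number": SEVERITY.get(m["level"], 0),  # 0 = unspecified
            "severity_text": m["level"],
            "attributes": attributes,
            "body": m["body"],
        }
    return None
```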
Non-Kubernetes Environments
Telegen supports trace correlation in non-Kubernetes environments using file path-based correlation.
How It Works
In non-K8s environments (VMs, bare metal):
- eBPF identifies log file - Uses the file path as correlation key (path:/var/log/myapp/application.log)
- Same key at read time - Filelog uses the same file path for lookup
- Correlation matches - Trace context retrieved and attached
Configuration
logs:
  log_enricher:
    enabled: true
    target_paths:
      - /var/log/myapp/*.log
      - /opt/app/logs/*.log
    filelog_correlation: true
  filelog:
    enabled: true
    include:
      - /var/log/myapp/*.log
      - /opt/app/logs/*.log
    trace_context_enrichment:
      enabled: true
      tolerance: 100ms
Correlation Keys
| Environment | Key Format | Example |
|---|---|---|
| Kubernetes | cid:<container_id> | cid:abc123def456 |
| Non-K8s | path:<file_path> | path:/var/log/myapp/app.log |
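Key selection can be expressed as a small helper. The function `correlation_key` is a hypothetical name used for illustration. The only behavior taken from the table is the two prefixes, with the container ID preferred whenever one is known.

```python
from typing import Optional


def correlation_key(container_id: Optional[str], file_path: str) -> str:
    """Build the cache key: container ID on Kubernetes, file path elsewhere."""
    if container_id:
        return f"cid:{container_id}"
    return f"path:{file_path}"
```

Because the writer side and the reader side derive the key from the same inputs, both arrive at the same cache entry without any shared state beyond the cache itself.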
OTLP Compliance
All logs are emitted as fully OTLP-compliant log records.
Log Record Structure
# Full OTLP LogRecord
resource:
  attributes:
    service.name: "my-service"
    k8s.pod.name: "my-service-7b5f8d4c9b-x2kpq"
    k8s.namespace.name: "production"
    k8s.node.name: "worker-01"
    host.name: "worker-01"
scope:
  name: "telegen.logs"
  version: "2.10.1"
log_record:
  time_unix_nano: 1706695800123456789
  observed_time_unix_nano: 1706695800200000000
  severity_number: 9
  severity_text: "INFO"
  body:
    string_value: "User login successful"
  attributes:
    - key: "user_id"
      value: { string_value: "12345" }
    - key: "log.file.path"
      value: { string_value: "/var/log/containers/my-service_default_app-abc123.log" }
  trace_id: "4bf92f3577b34da6a3ce929d0e0e4736"
  span_id: "00f067aa0ba902b7"
  flags: 1
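On the wire, OTLP carries trace_id as 16 bytes and span_id as 8 bytes, and the low bit of flags is the W3C "sampled" flag. The hex strings shown above are the human-readable form. A receiving-side check might look like the following sketch, where the function name `decode_trace_fields` is hypothetical:

```python
from typing import Tuple


def decode_trace_fields(trace_id_hex: str, span_id_hex: str, flags: int) -> Tuple[bytes, bytes, bool]:
    """Validate hex IDs and return (trace_id, span_id, sampled)."""
    trace_id = bytes.fromhex(trace_id_hex)
    span_id = bytes.fromhex(span_id_hex)
    # All-zero IDs are defined as invalid, so reject them along with bad lengths.
    if len(trace_id) != 16 or not any(trace_id):
        raise ValueError("trace_id must be 16 non-zero bytes")
    if len(span_id) != 8 or not any(span_id):
        raise ValueError("span_id must be 8 non-zero bytes")
    sampled = bool(flags & 0x01)
    return trace_id, span_id, sampled
```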
Semantic Conventions
Telegen follows OpenTelemetry semantic conventions:
| Attribute | Description |
|---|---|
| log.file.path | Source file path |
| log.iostream | stdout or stderr |
| container.runtime | docker, containerd, cri-o |
| exception.type | Exception class name |
| exception.message | Exception message |
| exception.stacktrace | Full stack trace |
Troubleshooting
Trace Context Not Appearing
- Check log_enricher is enabled: telegen config validate | grep log_enricher
- Verify eBPF programs loaded: telegen status | grep -A5 "Log Enricher"
- Check tolerance window:
  - Default 100ms should work for most cases
  - Increase if logs have high latency to disk
Missing Kubernetes Metadata
- Verify log path format: ls -la /var/log/containers/
- Check symlink resolution: /var/log/containers/*.log should symlink to /var/log/pods/
Performance Tuning
logs:
  filelog:
    # Batch size for OTLP export
    batch_size: 1000
    # Flush interval
    flush_interval: 1s
    # Max concurrent files
    max_concurrent_files: 100
  log_enricher:
    # Correlation cache settings
    cache_ttl: 30s
    cache_max_entries: 100000
    cache_bucket_size: 100ms
Metrics
Telegen exposes self-monitoring metrics for the log pipeline:
# Log lines processed
telegen_logs_records_total{source="filelog", k8s_namespace="default"} 1234567
# Trace enrichment success rate
telegen_logs_trace_enriched_total{method="cache"} 98765
telegen_logs_trace_enriched_total{method="json"} 45678
# Cache hit rate
telegen_correlation_cache_hits_total 98765
telegen_correlation_cache_misses_total 1234
# Parse errors
telegen_logs_parse_errors_total{parser="docker_json"} 12
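The hit and miss counters combine into a cache hit rate. As a small worked example using the sample values above (the helper name `hit_rate` is illustrative, and in practice this ratio would usually be computed over a time window in the metrics backend):

```python
def hit_rate(hits: int, misses: int) -> float:
    """Fraction of cache lookups that found a trace context."""
    total = hits + misses
    return hits / total if total else 0.0


# Sample counter values from the metrics listing.
rate = hit_rate(98765, 1234)
print(f"{rate:.2%}")
```

With these values the rate is 98765 / 99999, or roughly 98.8%. A falling rate suggests that log lines are being read outside the tolerance window or after cache entries have expired.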