Telegen Signals Reference
Complete catalog of all metrics and traces generated by Telegen, organized by source module.
This document is the authoritative reference for all observability signals emitted by Telegen. Each entry lists the signal's name, type, definition, labels or key attributes, operational impact, sentiment (whether an increase is good or bad), and source module.
Table of Contents
- eBPF Application Metrics
- eBPF Application Traces
- Network Flow Metrics
- Service Graph Metrics
- Span Metrics
- Profiler Metrics
- Node Exporter Metrics
- Kubernetes State Metrics
- GPU / AI-ML Metrics
- DNS Metrics
- Self-Telemetry Metrics
- Sentiment Value Reference
eBPF Application Metrics
Metrics generated by Telegen’s eBPF-based distributed tracing for HTTP, gRPC, database, and messaging clients/servers.
HTTP Server Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| http_server_request_duration_seconds | Histogram | Duration of HTTP service calls from the server side, in seconds | http_request_method, http_response_status_code, url_path, http_route, service_name, service_namespace | Latency indicator for server-side HTTP handling | ⬆️ Negative (higher latency = degraded performance) | pkg/export/prom |
| http_server_request_body_size_bytes | Histogram | Size of the HTTP request body as received at the server side, in bytes | http_request_method, http_response_status_code, service_name | Tracks incoming payload sizes | Neutral | pkg/export/prom |
| http_server_response_body_size_bytes | Histogram | Size of the HTTP response body as sent from the server side, in bytes | http_request_method, http_response_status_code, service_name | Tracks outgoing payload sizes | Neutral | pkg/export/prom |
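Histogram metrics such as http_server_request_duration_seconds are exposed as cumulative buckets, so percentiles have to be estimated from bucket counts. The following Python sketch shows one way to do that, using the same linear interpolation within a bucket that PromQL's histogram_quantile applies to classic histograms. The function name and the sample bucket boundaries are illustrative assumptions, not Telegen defaults.

```python
import math
from typing import Sequence, Tuple


def histogram_quantile(q: float, buckets: Sequence[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative (upper_bound, count) pairs.

    `buckets` must be sorted by upper bound and end with the +Inf bucket,
    mirroring the `le` series of a Prometheus histogram.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be within [0, 1]")
    if not buckets or not math.isinf(buckets[-1][0]):
        raise ValueError("buckets must end with the +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations yet
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0  # the lowest bucket is assumed to start at 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile lies beyond the highest finite bound; report that bound.
                return prev_bound
            in_bucket = count - prev_count
            if in_bucket == 0:
                return prev_bound
            # Linear interpolation inside the bucket that contains the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return math.nan


# Hypothetical bucket counts for http_server_request_duration_seconds.
sample = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
p95 = histogram_quantile(0.95, sample)  # falls in the (0.5, 1.0] bucket
```

Because the estimate is interpolated, its accuracy depends on how closely the bucket boundaries bracket the latencies of interest.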
HTTP Client Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| http_client_request_duration_seconds | Histogram | Duration of HTTP service calls from the client side, in seconds | http_request_method, http_response_status_code, server_address, service_name | Latency of outbound HTTP calls | ⬆️ Negative (higher = slower dependencies) | pkg/export/prom |
| http_client_request_body_size_bytes | Histogram | Size of the HTTP request body as sent from the client side, in bytes | http_request_method, server_address, service_name | Tracks outbound request sizes | Neutral | pkg/export/prom |
| http_client_response_body_size_bytes | Histogram | Size of the HTTP response body as received at the client side, in bytes | http_request_method, server_address, service_name | Tracks inbound response sizes | Neutral | pkg/export/prom |
gRPC Server Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| rpc_server_duration_seconds | Histogram | Duration of RPC service calls from the server side, in seconds | rpc_method, rpc_system, rpc_grpc_status_code, service_name | gRPC server latency | ⬆️ Negative | pkg/export/prom |
gRPC Client Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| rpc_client_duration_seconds | Histogram | Duration of gRPC service calls from the client side, in seconds | rpc.method, rpc.system, rpc.grpc.status_code, server.address, service.name | gRPC client latency | ⬆️ Negative | pkg/export/prom |
Database Client Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| db_client_operation_duration_seconds | Histogram | Duration of database client operations, in seconds | db.system.name, db.operation.name, db.collection.name, server.address, service.name | Database query latency | ⬆️ Negative (slow queries impact performance) | pkg/export/prom |
Messaging Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| messaging_publish_duration_seconds | Histogram | Duration of messaging client publish operations, in seconds | messaging.system, messaging.destination.name, messaging.operation.type, service.name | Message publish latency | ⬆️ Negative | pkg/export/prom |
| messaging_process_duration_seconds | Histogram | Duration of messaging client process operations, in seconds | messaging.system, messaging.destination.name, messaging.operation.type, service.name | Message processing latency | ⬆️ Negative | pkg/export/prom |
eBPF Application Traces
Distributed traces generated automatically by Telegen’s eBPF instrumentation.
HTTP Traces
| Span Name | Span Kind | Definition | Key Attributes | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| {http.method} {url.path} | Server | HTTP server request span | http.request.method, http.response.status_code, url.path, http.route, client.address, server.address, server.port, http.request.body.size, http.response.body.size | Represents inbound HTTP request | Error rate ⬆️ Negative | pkg/export/otel/tracesgen |
| {http.method} {url.full} | Client | HTTP client request span | http.request.method, http.response.status_code, url.full, server.address, peer.service, server.port | Represents outbound HTTP call | Error rate ⬆️ Negative | pkg/export/otel/tracesgen |
| in queue | Internal | Time spent waiting in request queue | Parent span reference | Queue wait time indicator | Duration ⬆️ Negative | pkg/export/otel/tracesgen |
| processing | Internal | Time spent processing request | Parent span reference | Processing time indicator | Duration ⬆️ Negative | pkg/export/otel/tracesgen |
gRPC Traces
| Span Name | Span Kind | Definition | Key Attributes | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| {rpc.method} | Server | gRPC server call span | rpc.method, rpc.system, rpc.grpc.status_code, client.address, server.address, server.port | Represents inbound gRPC call | Error status ⬆️ Negative | pkg/export/otel/tracesgen |
| {rpc.method} | Client | gRPC client call span | rpc.method, rpc.system, rpc.grpc.status_code, server.address, peer.service, server.port | Represents outbound gRPC call | Error status ⬆️ Negative | pkg/export/otel/tracesgen |
Database Traces
| Span Name | Span Kind | Definition | Key Attributes | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| {db.operation.name} {db.collection.name} | Client | SQL database client span | db.system.name, db.operation.name, db.collection.name, db.query.text (optional), server.address, db.response.status_code | SQL query execution | Error/slow queries ⬆️ Negative | pkg/export/otel/tracesgen |
| {db.operation.name} | Client | Redis client span | db.system.name (redis), db.operation.name, db.query.text (optional), server.address, db.namespace | Redis command execution | Latency ⬆️ Negative | pkg/export/otel/tracesgen |
| {db.operation.name} {db.collection.name} | Client | MongoDB client span | db.system.name (mongodb), db.operation.name, db.collection.name, server.address, db.namespace | MongoDB operation | Latency ⬆️ Negative | pkg/export/otel/tracesgen |
| {db.operation.name} {db.collection.name} | Client | Couchbase client span | db.system.name (couchbase), db.operation.name, db.collection.name, server.address, db.namespace | Couchbase operation | Latency ⬆️ Negative | pkg/export/otel/tracesgen |
Messaging Traces
| Span Name | Span Kind | Definition | Key Attributes | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| {messaging.destination.name} {messaging.operation.type} | Client/Producer | Kafka client span | messaging.system (kafka), messaging.destination.name, messaging.client.id, messaging.operation.type, messaging.destination.partition.id, messaging.kafka.offset | Kafka produce/consume | Lag ⬆️ Negative | pkg/export/otel/tracesgen |
| {messaging.destination.name} {messaging.operation.type} | Client | MQTT client span | messaging.system (mqtt), messaging.destination.name, messaging.client.id, messaging.operation.type | MQTT publish/subscribe | Latency ⬆️ Negative | pkg/export/otel/tracesgen |
DNS Traces
| Span Name | Span Kind | Definition | Key Attributes | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| DNS {dns.question.name} | Client | DNS lookup span | dns.question.name, dns.answers, client.address, server.address, server.port, error.message (on failure) | DNS resolution | Lookup failures ⬆️ Negative | pkg/export/otel/tracesgen |
GPU Traces
| Span Name | Span Kind | Definition | Key Attributes | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| CUDA Kernel Launch | Internal | GPU kernel launch event | cuda.kernel.name, grid/block size | GPU kernel execution | Kernel time ⬆️ Negative | pkg/export/otel/tracesgen |
| CUDA Malloc | Internal | GPU memory allocation | Memory size | GPU memory allocation | Failures ⬆️ Negative | pkg/export/otel/tracesgen |
| CUDA Memcpy | Internal | GPU memory copy | cuda.memcpy.kind, size | Data transfer | Duration ⬆️ Negative | pkg/export/otel/tracesgen |
Network Flow Metrics
Network-level metrics collected via eBPF for flow analysis and inter-zone traffic monitoring.
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| obi_network_flow_bytes_total | Counter | Bytes submitted from a source network endpoint to a destination network endpoint | src.address, dst.address, src.port, dst.port, src.name, dst.name, direction, transport, k8s.src.owner.name, k8s.dst.owner.name, k8s.src.namespace, k8s.dst.namespace | Network traffic volume | Neutral (high traffic may indicate load) | pkg/export/prom/prom_net |
| obi_network_inter_zone_bytes_total | Counter | Bytes submitted between different cloud availability zones | src.zone, dst.zone, k8s.cluster.name | Cross-zone egress costs | ⬆️ Negative (higher = more egress costs) | pkg/export/prom/prom_net |
Service Graph Metrics
Metrics for visualizing service-to-service dependencies in trace formats compatible with Grafana Tempo.
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| traces_service_graph_request_client_seconds | Histogram | Duration of client service calls, in seconds, in trace service graph format | client, server, client_service_namespace, server_service_namespace, connection_type | Client-side latency in service graph | ⬆️ Negative | pkg/export/otel/metrics_svc_graph |
| traces_service_graph_request_server_seconds | Histogram | Duration of server service calls, in seconds, in trace service graph format | client, server, client_service_namespace, server_service_namespace, connection_type | Server-side latency in service graph | ⬆️ Negative | pkg/export/otel/metrics_svc_graph |
| traces_service_graph_request_total | Counter | Number of service calls in trace service graph format | client, server, client_service_namespace, server_service_namespace, connection_type | Request volume between services | Neutral | pkg/export/otel/metrics_svc_graph |
| traces_service_graph_request_failed_total | Counter | Number of failed service calls in trace service graph format | client, server, client_service_namespace, server_service_namespace, connection_type | Failed calls between services | ⬆️ Negative (failures indicate issues) | pkg/export/otel/metrics_svc_graph |
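The two counters traces_service_graph_request_total and traces_service_graph_request_failed_total are most useful as a ratio per client/server edge. A minimal Python sketch of that calculation follows, assuming two raw samples of each counter taken one scrape interval apart; the helper names and sample values are hypothetical.

```python
def counter_increase(previous: float, current: float) -> float:
    """Increase between two samples of a monotonic counter.

    A drop means the counter was reset (for example after a restart), so the
    current value is taken as the increase since the reset.
    """
    return current if current < previous else current - previous


def failure_ratio(total_prev: float, total_now: float,
                  failed_prev: float, failed_now: float) -> float:
    """Fraction of calls on one client/server edge that failed in the window."""
    calls = counter_increase(total_prev, total_now)
    failures = counter_increase(failed_prev, failed_now)
    return failures / calls if calls > 0 else 0.0


# Hypothetical samples for one edge, taken one scrape interval apart.
ratio = failure_ratio(total_prev=1000, total_now=1200, failed_prev=10, failed_now=20)
```

In a Prometheus setup the same ratio would normally be computed in the query layer; the sketch only makes the arithmetic explicit.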
Span Metrics
RED metrics derived from distributed traces, compatible with Grafana Tempo span metrics and the OTel spanmetricsconnector.
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| traces_spanmetrics_latency | Histogram | Duration of service calls (client and server). Legacy Tempo-style name; when using the OTel spanmetricsconnector this is exported as traces_span_metrics_duration_seconds. Unit configurable (ms or s). | service.name, span.name, span.kind, status.code | Span latency distribution | ⬆️ Negative | pkg/export/prom |
| traces_spanmetrics_calls_total | Counter | Number of service calls, including errors (filtered by status.code). Compatible with OTel spanmetricsconnector | service.name, span.name, span.kind, status.code | Span call volume / error rate | Neutral | pkg/export/prom |
| traces_spanmetrics_size_total | Counter | Size of service calls in bytes (Telegen extension) | service.name, span.name, span.kind | Request payload size | Neutral | pkg/export/prom |
| traces_spanmetrics_response_size_total | Counter | Size of service responses in bytes (Telegen extension) | service.name, span.name, span.kind | Response payload size | Neutral | pkg/export/prom |
| traces_target_info | Gauge | Target service information in trace span metric format | service.name, service.namespace, k8s.* labels | Service metadata | Neutral (informational) | pkg/export/prom |
| traces_host_info | Gauge | Host information with constant value 1 labeled by host id | host.id, host.name | Host metadata | Neutral (informational) | pkg/export/prom |
Profiler Metrics
Metrics from continuous profiling and symbol resolution.
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| telegen_profiler_symbols_resolved_total | Counter | Total number of symbols resolved by source | source (go, elf, jit, kernel, dwarf), status (resolved, unresolved) | Symbol resolution success rate | ⬆️ Positive (more resolved = better flamegraphs) | internal/profiler |
| telegen_profiler_symbol_cache_total | Counter | Symbol cache hit/miss counters | result (hit, miss) | Cache efficiency | Hit rate ⬆️ Positive | internal/profiler |
| telegen_profiler_symbol_resolution_duration_seconds | Histogram | Time spent resolving symbols | None | Symbol resolution performance | ⬆️ Negative (slower resolution) | internal/profiler |
| telegen_profiler_symbol_cache_size | Gauge | Current size of symbol cache | None | Memory usage indicator | Neutral | internal/profiler |
Node Exporter Metrics
Prometheus node_exporter compatible system metrics. Full compatibility with standard dashboards.
CPU Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| node_cpu_seconds_total | Counter | CPU time spent in each mode | cpu, mode (user, system, idle, iowait, irq, softirq, steal, guest) | CPU utilization indicator | idle ⬇️ Negative (less idle = more load) | internal/nodeexporter |
| node_cpu_frequency_hertz | Gauge | Current CPU frequency in hertz | cpu | CPU performance state | Neutral | internal/nodeexporter |
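Since node_cpu_seconds_total is a per-mode counter, utilization is derived from how much of the elapsed CPU time was not idle. The Python sketch below illustrates the calculation for a single CPU from two samples; it assumes the values follow Linux /proc/stat accounting, and the function name and sample numbers are hypothetical.

```python
from typing import Mapping


def cpu_utilization(before: Mapping[str, float], after: Mapping[str, float]) -> float:
    """Busy fraction (0.0 to 1.0) of one CPU between two per-mode samples."""
    # On Linux, guest time is already counted inside user time, so it is
    # skipped here to avoid counting it twice.
    modes = [mode for mode in after if mode != "guest"]
    deltas = {mode: after[mode] - before.get(mode, 0.0) for mode in modes}
    total = sum(deltas.values())
    if total <= 0:
        return 0.0  # no elapsed CPU time between the samples
    return 1.0 - deltas.get("idle", 0.0) / total


# Hypothetical cumulative seconds per mode for cpu="0", two scrapes apart.
first = {"user": 100.0, "system": 50.0, "idle": 800.0, "iowait": 50.0, "guest": 5.0}
second = {"user": 130.0, "system": 60.0, "idle": 850.0, "iowait": 60.0, "guest": 9.0}
busy = cpu_utilization(first, second)
```

Whether iowait should count as busy or idle time is a policy choice; this sketch treats only the idle mode as idle.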
Memory Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| node_memory_MemTotal_bytes | Gauge | Total memory in bytes | None | Total memory capacity | Neutral (baseline) | internal/nodeexporter |
| node_memory_MemFree_bytes | Gauge | Free memory in bytes | None | Unused memory | ⬇️ Negative (low free memory) | internal/nodeexporter |
| node_memory_MemAvailable_bytes | Gauge | Available memory for allocation | None | Allocatable memory | ⬇️ Negative (low availability) | internal/nodeexporter |
| node_memory_Buffers_bytes | Gauge | Memory used for buffers | None | Buffer cache size | Neutral | internal/nodeexporter |
| node_memory_Cached_bytes | Gauge | Memory used for page cache | None | Page cache size | Neutral | internal/nodeexporter |
| node_memory_SwapTotal_bytes | Gauge | Total swap space | None | Swap capacity | Neutral | internal/nodeexporter |
| node_memory_SwapFree_bytes | Gauge | Free swap space | None | Unused swap | ⬇️ Negative (swap pressure) | internal/nodeexporter |
Disk Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| node_disk_reads_completed_total | Counter | Total read operations completed | device | Disk read activity | Neutral | internal/nodeexporter |
| node_disk_writes_completed_total | Counter | Total write operations completed | device | Disk write activity | Neutral | internal/nodeexporter |
| node_disk_read_bytes_total | Counter | Total bytes read from disk | device | Disk read throughput | Neutral | internal/nodeexporter |
| node_disk_written_bytes_total | Counter | Total bytes written to disk | device | Disk write throughput | Neutral | internal/nodeexporter |
| node_disk_io_time_seconds_total | Counter | Total time spent on I/O | device | Disk utilization | ⬆️ Negative (high I/O wait) | internal/nodeexporter |
| node_disk_io_now | Gauge | Number of I/Os in progress | device | Current I/O queue depth | ⬆️ Negative (I/O saturation) | internal/nodeexporter |
Filesystem Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| node_filesystem_size_bytes | Gauge | Total filesystem size | device, fstype, mountpoint | Storage capacity | Neutral (baseline) | internal/nodeexporter |
| node_filesystem_free_bytes | Gauge | Free space on filesystem | device, fstype, mountpoint | Available storage | ⬇️ Negative (low space) | internal/nodeexporter |
| node_filesystem_avail_bytes | Gauge | Space available to non-root users | device, fstype, mountpoint | Usable storage | ⬇️ Negative (low space) | internal/nodeexporter |
| node_filesystem_files | Gauge | Total inodes | device, fstype, mountpoint | Inode capacity | Neutral | internal/nodeexporter |
| node_filesystem_files_free | Gauge | Free inodes | device, fstype, mountpoint | Available inodes | ⬇️ Negative (inode exhaustion) | internal/nodeexporter |
Network Interface Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| node_network_receive_bytes_total | Counter | Total bytes received | device | Network ingress | Neutral | internal/nodeexporter |
| node_network_transmit_bytes_total | Counter | Total bytes transmitted | device | Network egress | Neutral | internal/nodeexporter |
| node_network_receive_packets_total | Counter | Total packets received | device | Packet ingress | Neutral | internal/nodeexporter |
| node_network_transmit_packets_total | Counter | Total packets transmitted | device | Packet egress | Neutral | internal/nodeexporter |
| node_network_receive_errs_total | Counter | Total receive errors | device | Receive errors | ⬆️ Negative | internal/nodeexporter |
| node_network_transmit_errs_total | Counter | Total transmit errors | device | Transmit errors | ⬆️ Negative | internal/nodeexporter |
| node_network_receive_drop_total | Counter | Total receive drops | device | Packet drops (RX) | ⬆️ Negative | internal/nodeexporter |
| node_network_transmit_drop_total | Counter | Total transmit drops | device | Packet drops (TX) | ⬆️ Negative | internal/nodeexporter |
Load Average Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| node_load1 | Gauge | 1-minute load average | None | Short-term load | ⬆️ Negative (above CPU count) | internal/nodeexporter |
| node_load5 | Gauge | 5-minute load average | None | Medium-term load | ⬆️ Negative (above CPU count) | internal/nodeexporter |
| node_load15 | Gauge | 15-minute load average | None | Long-term load | ⬆️ Negative (above CPU count) | internal/nodeexporter |
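The load averages only become negative once they exceed the number of CPUs, so they are usually normalized per core before alerting. The following Python sketch shows that normalization and a simple trend check across the three windows; the function names and the trend labels are illustrative assumptions.

```python
def load_per_cpu(load_average: float, cpu_count: int) -> float:
    """Load average divided by CPU count; values above 1.0 suggest saturation."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_average / cpu_count


def load_trend(load1: float, load5: float, load15: float) -> str:
    """Compare the 1, 5 and 15 minute windows to see which way load is moving."""
    if load1 > load5 > load15:
        return "rising"
    if load1 < load5 < load15:
        return "falling"
    return "steady"


# Hypothetical readings from a 4-CPU node.
pressure = load_per_cpu(6.0, cpu_count=4)  # 1.5, above the CPU count
trend = load_trend(6.0, 4.5, 3.0)
```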

| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| node_boot_time_seconds | Gauge | Node boot time (unix timestamp) | None | System uptime reference | Neutral | internal/nodeexporter |
| node_context_switches_total | Counter | Total context switches | None | Scheduler activity | ⬆️ Negative (high = CPU contention) | internal/nodeexporter |
| node_forks_total | Counter | Total process forks | None | Process creation rate | Neutral (depends on workload) | internal/nodeexporter |
| node_procs_running | Gauge | Running processes | None | Active processes | ⬆️ Negative (saturation) | internal/nodeexporter |
| node_procs_blocked | Gauge | Blocked processes | None | I/O blocked processes | ⬆️ Negative (I/O bottleneck) | internal/nodeexporter |
| node_uname_info | Gauge | System information | sysname, release, version, machine, nodename | System metadata | Neutral (informational) | internal/nodeexporter |
Kubernetes State Metrics
kube-state-metrics compatible metrics for Kubernetes object states.
Pod Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| kube_pod_info | Info | Information about pod | namespace, pod, node, host_ip, pod_ip, created_by_kind, created_by_name, uid | Pod metadata | Neutral (informational) | internal/kubestate |
| kube_pod_start_time | Gauge | Start time in unix timestamp | namespace, pod, uid | Pod age reference | Neutral | internal/kubestate |
| kube_pod_owner | Info | Pod owner information | namespace, pod, owner_kind, owner_name, owner_is_controller | Ownership metadata | Neutral (informational) | internal/kubestate |
| kube_pod_status_phase | Gauge | Pod current phase (1 = active) | namespace, pod, uid, phase (Pending, Running, Succeeded, Failed, Unknown) | Pod lifecycle state | non-Running ⬆️ Negative | internal/kubestate |
| kube_pod_status_ready | Gauge | Pod ready to serve (1 = ready) | namespace, pod, uid, condition | Pod readiness | 0 = Negative (not ready) | internal/kubestate |
| kube_pod_status_scheduled | Gauge | Pod scheduled status | namespace, pod, uid, condition | Scheduling state | 0 = Negative (unscheduled) | internal/kubestate |
| kube_pod_container_status_restarts_total | Counter | Container restart count | namespace, pod, uid, container | Container stability | ⬆️ Negative (restarts = instability) | internal/kubestate |
| kube_pod_container_status_running | Gauge | Container is running (1 = true) | namespace, pod, uid, container | Container state | 0 = Negative (not running) | internal/kubestate |
| kube_pod_container_status_waiting | Gauge | Container is waiting (1 = true) | namespace, pod, uid, container | Container state | 1 = Negative (stuck) | internal/kubestate |
| kube_pod_container_status_terminated | Gauge | Container is terminated (1 = true) | namespace, pod, uid, container | Container state | Depends on exit code | internal/kubestate |
| kube_pod_container_resource_requests | Gauge | Container resource requests | namespace, pod, container, resource (cpu, memory), unit | Resource allocation | Neutral | internal/kubestate |
| kube_pod_container_resource_limits | Gauge | Container resource limits | namespace, pod, container, resource (cpu, memory), unit | Resource constraints | Neutral | internal/kubestate |
Deployment Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| kube_deployment_status_replicas | Gauge | Number of replicas per deployment | namespace, deployment | Current replica count | Neutral | internal/kubestate |
| kube_deployment_status_replicas_available | Gauge | Available replicas | namespace, deployment | Ready replicas | ⬇️ Negative (below desired) | internal/kubestate |
| kube_deployment_status_replicas_unavailable | Gauge | Unavailable replicas | namespace, deployment | Failing replicas | ⬆️ Negative | internal/kubestate |
| kube_deployment_spec_replicas | Gauge | Desired number of replicas | namespace, deployment | Target replica count | Neutral (baseline) | internal/kubestate |
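The deployment gauges are meaningful mainly in relation to each other: available replicas are "below desired" only when compared with kube_deployment_spec_replicas. A small Python sketch of that comparison is shown here; the function name and the status labels are hypothetical and not part of Telegen.

```python
def deployment_status(spec_replicas: int, available: int, unavailable: int) -> str:
    """Classify a deployment from its desired, available and unavailable replicas."""
    if spec_replicas == 0:
        return "scaled-to-zero"  # nothing is expected to run
    if available == 0:
        return "down"            # replicas are desired but none are available
    if available < spec_replicas or unavailable > 0:
        return "degraded"        # serving, but below the desired count
    return "healthy"


# Hypothetical readings: 3 desired, 2 available, 1 unavailable.
status = deployment_status(spec_replicas=3, available=2, unavailable=1)
```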
Node Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| kube_node_info | Info | Node information | node, kernel_version, os_image, container_runtime_version, kubelet_version | Node metadata | Neutral (informational) | internal/kubestate |
| kube_node_status_condition | Gauge | Node condition status | node, condition (Ready, MemoryPressure, DiskPressure, PIDPressure), status | Node health | non-Ready ⬆️ Negative | internal/kubestate |
| kube_node_status_allocatable | Gauge | Allocatable resources | node, resource (cpu, memory, pods) | Available capacity | Neutral | internal/kubestate |
| kube_node_status_capacity | Gauge | Total node capacity | node, resource (cpu, memory, pods) | Total capacity | Neutral (baseline) | internal/kubestate |
StatefulSet, DaemonSet, ReplicaSet Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| kube_statefulset_replicas | Gauge | Desired StatefulSet replicas | namespace, statefulset | Target replicas | Neutral | internal/kubestate |
| kube_statefulset_status_replicas_ready | Gauge | Ready StatefulSet replicas | namespace, statefulset | Available replicas | ⬇️ Negative (below desired) | internal/kubestate |
| kube_daemonset_status_desired_number_scheduled | Gauge | Desired DaemonSet pods | namespace, daemonset | Target pods | Neutral | internal/kubestate |
| kube_daemonset_status_number_ready | Gauge | Ready DaemonSet pods | namespace, daemonset | Available pods | ⬇️ Negative (below desired) | internal/kubestate |
| kube_replicaset_status_replicas | Gauge | Current ReplicaSet replicas | namespace, replicaset | Running replicas | Neutral | internal/kubestate |
| kube_replicaset_status_ready_replicas | Gauge | Ready ReplicaSet replicas | namespace, replicaset | Available replicas | ⬇️ Negative (below current) | internal/kubestate |
HPA Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| kube_horizontalpodautoscaler_status_current_replicas | Gauge | Current HPA replicas | namespace, horizontalpodautoscaler | Actual scale | Neutral | internal/kubestate |
| kube_horizontalpodautoscaler_spec_min_replicas | Gauge | Minimum HPA replicas | namespace, horizontalpodautoscaler | Scale floor | Neutral | internal/kubestate |
| kube_horizontalpodautoscaler_spec_max_replicas | Gauge | Maximum HPA replicas | namespace, horizontalpodautoscaler | Scale ceiling | Neutral | internal/kubestate |
GPU / AI-ML Metrics
NVIDIA GPU metrics for AI/ML workload observability.
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| gpu_kernel_launch_calls_total | Counter | Number of GPU kernel launches | cuda.kernel.name, service.name | GPU compute activity | Neutral | pkg/export/prom |
| gpu_memory_allocations_bytes_total | Counter | GPU memory allocated, in bytes | service.name | GPU memory usage | Neutral | pkg/export/prom |
| gpu_kernel_grid_size_total | Histogram | Number of blocks in the GPU kernel grid | service.name | Kernel parallelism | Neutral | pkg/export/prom |
| gpu_kernel_block_size_total | Histogram | Number of threads in the GPU kernel block | service.name | Thread parallelism | Neutral | pkg/export/prom |
| gpu_memory_copies_bytes_total | Histogram | Bytes copied to and from GPU memory | cuda.memcpy.kind, service.name | Data transfer volume | Neutral | pkg/export/prom |
DNS Metrics
DNS lookup metrics from eBPF tracing.
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| dns_lookup_duration_seconds | Histogram | Time taken to perform a DNS lookup | dns.question.name, service.name | DNS resolution latency | ⬆️ Negative (slow DNS) | pkg/export/prom |
Self-Telemetry Metrics
Internal metrics about Telegen’s own operation.
Signal Collection Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| telegen_signals_collected_total | Counter | Total signals collected by type and collector | signal_type (traces, metrics, logs, profiles), collector | Collection throughput | ⬆️ Positive (healthy collection) | internal/pipeline |
| telegen_signals_dropped_total | Counter | Total signals dropped by type and reason | signal_type, reason (queue_full, rate_limited, filter_dropped) | Data loss indicator | ⬆️ Negative (data loss) | internal/pipeline |
Queue Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| telegen_queue_size | Gauge | Current queue size by signal type | signal_type | Buffer utilization | ⬆️ Negative (backpressure) | internal/pipeline |
| telegen_queue_dropped_total | Counter | Total items dropped from queue | signal_type, reason | Queue overflow | ⬆️ Negative | internal/pipeline |
| telegen_queue_latency_seconds | Histogram | Queue wait time in seconds | signal_type | Queue delay | ⬆️ Negative | internal/pipeline |
Export Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| telegen_export_success_total | Counter | Successful exports by endpoint and signal type | endpoint, signal_type | Export success rate | ⬆️ Positive | internal/pipeline |
| telegen_export_failure_total | Counter | Failed exports by endpoint and signal type | endpoint, signal_type | Export failures | ⬆️ Negative | internal/pipeline |
| telegen_export_latency_seconds | Histogram | Export latency in seconds | endpoint, signal_type | Export performance | ⬆️ Negative | internal/pipeline |
| telegen_export_retries_total | Counter | Export retry attempts by endpoint | endpoint | Retry pressure | ⬆️ Negative | internal/pipeline |
Circuit Breaker Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| telegen_circuit_breaker_state | Gauge | Circuit breaker state (0=closed, 1=half-open, 2=open) | endpoint | Export health | open state = Negative | internal/pipeline |
| telegen_circuit_breaker_trips_total | Counter | Circuit breaker trip count by endpoint | endpoint | Export failures | ⬆️ Negative | internal/pipeline |
| telegen_circuit_breaker_recoveries_total | Counter | Circuit breaker recovery count by endpoint | endpoint | Export recovery | ⬆️ Positive | internal/pipeline |
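Because telegen_circuit_breaker_state encodes its state as a number, dashboards and alert handlers typically need to map it back to a name. The following Python sketch shows one way to decode the documented values (0=closed, 1=half-open, 2=open); the enum and function names are hypothetical.

```python
from enum import IntEnum
from typing import Tuple


class BreakerState(IntEnum):
    """Gauge values documented for telegen_circuit_breaker_state."""
    CLOSED = 0
    HALF_OPEN = 1
    OPEN = 2


def breaker_status(value: float) -> Tuple[str, bool]:
    """Return the state name and whether it counts as negative (open)."""
    state = BreakerState(int(value))  # raises ValueError for undocumented values
    name = state.name.lower().replace("_", "-")
    return name, state is BreakerState.OPEN


# Hypothetical gauge reading for one endpoint.
name, is_negative = breaker_status(2.0)
```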
eBPF Internal Metrics
| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| obi_ebpf_tracer_flushes | Histogram | Length of groups of traces flushed from eBPF tracer | None | Batch size distribution | Neutral | pkg/export/imetrics |
| obi_otel_metric_exports_total | Counter | Length of metric batches submitted to OTEL collector | None | Export throughput | ⬆️ Positive | pkg/export/imetrics |
| obi_otel_metric_export_errors_total | Counter | Error count on each failed OTEL metric export | error | Export errors | ⬆️ Negative | pkg/export/imetrics |
| obi_otel_trace_exports_total | Counter | Length of trace batches submitted to OTEL collector | None | Export throughput | ⬆️ Positive | pkg/export/imetrics |
| obi_otel_trace_export_errors_total | Counter | Error count on each failed OTEL trace export | error | Export errors | ⬆️ Negative | pkg/export/imetrics |
| obi_instrumented_processes | Gauge | Total number of instrumented processes | process_name | Instrumentation coverage | ⬆️ Positive | pkg/export/imetrics |
| obi_instrumentation_errors_total | Counter | Total instrumentation errors by process name | process_name, error_type | Instrumentation failures | ⬆️ Negative | pkg/export/imetrics |
| obi_bpf_probe_latency_seconds | Histogram | Latency of BPF probes in seconds | probe_id, probe_type, probe_name | eBPF overhead | ⬆️ Negative | pkg/export/imetrics |
| obi_bpf_map_entries_total | Gauge | Current entries in BPF maps | map_id, map_name, map_type | Map utilization | ⬆️ Negative (if near max) | pkg/export/imetrics |
| obi_bpf_map_max_entries_total | Gauge | Maximum entries in BPF maps | map_id, map_name, map_type | Map capacity | Neutral (baseline) | pkg/export/imetrics |
| obi_kube_cache_forward_lag_seconds | Histogram | Time since K8s event until forwarded to subscribers | None | Kubernetes metadata freshness | ⬆️ Negative | pkg/export/imetrics |
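The sentiment "if near max" for obi_bpf_map_entries_total implies comparing it with obi_bpf_map_max_entries_total for the same map. A minimal Python sketch of that comparison follows; the 0.9 threshold, the function names, and the map names in the sample are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def map_utilization(entries: float, max_entries: float) -> float:
    """Fraction of a BPF map's capacity that is currently in use."""
    if max_entries <= 0:
        raise ValueError("max_entries must be positive")
    return entries / max_entries


def maps_near_capacity(samples: Dict[str, Tuple[float, float]],
                       threshold: float = 0.9) -> List[str]:
    """Names of maps whose utilization is at or above the threshold.

    `samples` maps a map_name label to (entries, max_entries).
    """
    return sorted(
        name for name, (entries, max_entries) in samples.items()
        if map_utilization(entries, max_entries) >= threshold
    )


# Hypothetical readings keyed by the map_name label.
readings = {"conn_track": (9500, 10000), "http_reqs": (100, 4096)}
crowded = maps_near_capacity(readings)
```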

| Metric Name | Type | Definition | Labels | Impact | Sentiment | Source Module |
| --- | --- | --- | --- | --- | --- | --- |
| obi_internal_build_info | Gauge | Constant value 1 labeled with build information | goarch, goos, goversion, version, revision | Build metadata | Neutral (informational) | pkg/export/imetrics |
| obi_build_info | Gauge | Telegen build information with service language | goarch, goos, goversion, version, revision, telemetry_sdk_language | Build + service metadata | Neutral (informational) | pkg/export/prom |
| target_info | Gauge | Attributes associated to a given monitored entity | service.name, service.namespace, k8s.* labels | Entity metadata | Neutral (informational) | pkg/export/prom |
Sentiment Value Reference
Understanding metric sentiments:
| Symbol | Meaning |
| --- | --- |
| ⬆️ Positive | Increase indicates healthy/good behavior |
| ⬆️ Negative | Increase indicates problem/degradation |
| ⬇️ Positive | Decrease indicates healthy/good behavior |
| ⬇️ Negative | Decrease indicates problem/degradation |
| Neutral | No inherent good/bad sentiment, depends on context |
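The sentiment values can also be applied mechanically, for example when annotating a change in a dashboard. The Python sketch below encodes the five values and classifies an observed change; the enum, the function, and the returned labels are hypothetical, and movements the reference does not describe (such as a decrease for an "⬆️ Negative" signal) are deliberately reported as unspecified.

```python
from enum import Enum


class Sentiment(Enum):
    UP_POSITIVE = "up-positive"      # increase indicates healthy behavior
    UP_NEGATIVE = "up-negative"      # increase indicates a problem
    DOWN_POSITIVE = "down-positive"  # decrease indicates healthy behavior
    DOWN_NEGATIVE = "down-negative"  # decrease indicates a problem
    NEUTRAL = "neutral"              # depends on context


def assess_change(sentiment: Sentiment, delta: float) -> str:
    """Classify a change in a signal's value according to its sentiment."""
    if sentiment is Sentiment.NEUTRAL:
        return "context-dependent"
    if delta == 0:
        return "unchanged"
    increased = delta > 0
    if sentiment is Sentiment.UP_POSITIVE and increased:
        return "healthy"
    if sentiment is Sentiment.DOWN_POSITIVE and not increased:
        return "healthy"
    if sentiment is Sentiment.UP_NEGATIVE and increased:
        return "degraded"
    if sentiment is Sentiment.DOWN_NEGATIVE and not increased:
        return "degraded"
    return "unspecified"  # the reference says nothing about this direction


# Example: a latency histogram (an "up-negative" signal) whose p95 grew by 0.2 s.
verdict = assess_change(Sentiment.UP_NEGATIVE, 0.2)
```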
Next Steps