NVIDIA GPU metrics, exported when AI/ML observability is enabled.

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `telegen_gpu_utilization_ratio` | Gauge | `gpu`, `uuid` | GPU utilization (0-1) |
| `telegen_gpu_memory_used_bytes` | Gauge | `gpu`, `uuid` | Memory used |
| `telegen_gpu_memory_total_bytes` | Gauge | `gpu`, `uuid` | Total memory |
| `telegen_gpu_memory_utilization_ratio` | Gauge | `gpu`, `uuid` | Memory utilization (0-1) |
| `telegen_gpu_temperature_celsius` | Gauge | `gpu`, `uuid` | GPU temperature |
| `telegen_gpu_power_watts` | Gauge | `gpu`, `uuid` | Power usage |
| `telegen_gpu_power_limit_watts` | Gauge | `gpu`, `uuid` | Power limit |
| `telegen_gpu_clock_graphics_hertz` | Gauge | `gpu`, `uuid` | Graphics clock |
| `telegen_gpu_clock_sm_hertz` | Gauge | `gpu`, `uuid` | SM clock |
| `telegen_gpu_clock_memory_hertz` | Gauge | `gpu`, `uuid` | Memory clock |
| `telegen_gpu_pcie_tx_bytes_total` | Counter | `gpu`, `uuid` | PCIe TX bytes |
| `telegen_gpu_pcie_rx_bytes_total` | Counter | `gpu`, `uuid` | PCIe RX bytes |
| `telegen_gpu_ecc_errors_total` | Counter | `gpu`, `uuid`, `type` | ECC errors |
| `telegen_gpu_nvlink_tx_bytes_total` | Counter | `gpu`, `uuid`, `link` | NVLink TX |
| `telegen_gpu_nvlink_rx_bytes_total` | Counter | `gpu`, `uuid`, `link` | NVLink RX |
| `telegen_gpu_compute_processes` | Gauge | `gpu`, `uuid` | Compute processes |
| `telegen_gpu_graphics_processes` | Gauge | `gpu`, `uuid` | Graphics processes |
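
As a rough illustration of how the GPU gauges and counters above might be combined, the queries below derive ratios and rates from the listed metrics. The time windows (`5m`, `1h`) are arbitrary choices for the sketch, not recommended values.

```promql
# Memory utilization recomputed from the raw byte gauges (0-1), per GPU
telegen_gpu_memory_used_bytes / telegen_gpu_memory_total_bytes

# Fraction of the configured power limit currently drawn, per GPU
telegen_gpu_power_watts / telegen_gpu_power_limit_watts

# PCIe transmit throughput in bytes per second, per GPU
rate(telegen_gpu_pcie_tx_bytes_total[5m])

# ECC errors accumulated over the last hour, summed across error types
sum by (gpu, uuid) (increase(telegen_gpu_ecc_errors_total[1h]))
```

The two ratio queries rely on both sides carrying identical label sets (`gpu`, `uuid`, plus any common labels), so one-to-one vector matching applies without an explicit `on(...)` clause.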
LLM Inference Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `telegen_llm_requests_total` | Counter | `model`, `endpoint` | Inference requests |
| `telegen_llm_tokens_input_total` | Counter | `model` | Input tokens |
| `telegen_llm_tokens_output_total` | Counter | `model` | Output tokens |
| `telegen_llm_time_to_first_token_seconds` | Histogram | `model` | TTFT latency |
| `telegen_llm_tokens_per_second` | Gauge | `model` | Token generation rate |
| `telegen_llm_batch_size` | Histogram | `model` | Batch sizes |
| `telegen_llm_cache_hit_ratio` | Gauge | `model` | KV cache hit ratio |
| `telegen_llm_queue_depth` | Gauge | `model` | Request queue |
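
A sketch of typical queries over the LLM inference metrics follows. It assumes the histogram is exposed in the standard Prometheus form, with a `_bucket` series carrying an `le` label; that suffix is not stated in the table above, so verify it against your scrape output.

```promql
# p95 time to first token per model over the last 5 minutes
histogram_quantile(
  0.95,
  sum by (le, model) (rate(telegen_llm_time_to_first_token_seconds_bucket[5m]))
)

# Output tokens generated per second, per model
sum by (model) (rate(telegen_llm_tokens_output_total[5m]))

# Average output tokens per request, per model
sum by (model) (rate(telegen_llm_tokens_output_total[5m]))
  /
sum by (model) (rate(telegen_llm_requests_total[5m]))
```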
Network Flow Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `telegen_flow_bytes_total` | Counter | `src`, `dst`, `protocol`, `direction` | Flow bytes |
| `telegen_flow_packets_total` | Counter | `src`, `dst`, `protocol`, `direction` | Flow packets |
| `telegen_flow_connections_total` | Counter | `protocol` | Connection count |
| `telegen_flow_active_connections` | Gauge | `protocol` | Active connections |
| `telegen_flow_rtt_seconds` | Histogram | `src`, `dst` | Round-trip time |
| `telegen_flow_retransmits_total` | Counter | `src`, `dst` | Retransmissions |
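
The flow counters lend themselves to rate and top-N queries. The following is a minimal sketch; the `_bucket` series name assumes standard Prometheus histogram exposition, and the window sizes are illustrative.

```promql
# Top 10 source/destination pairs by throughput (bytes per second)
topk(10, sum by (src, dst) (rate(telegen_flow_bytes_total[5m])))

# Retransmissions per second for each source/destination pair
sum by (src, dst) (rate(telegen_flow_retransmits_total[5m]))

# p99 round-trip time for each source/destination pair
histogram_quantile(
  0.99,
  sum by (le, src, dst) (rate(telegen_flow_rtt_seconds_bucket[5m]))
)
```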
Database Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `telegen_db_queries_total` | Counter | `db_type`, `operation` | Query count |
| `telegen_db_query_duration_seconds` | Histogram | `db_type`, `operation` | Query latency |
| `telegen_db_connections_active` | Gauge | `db_type`, `host` | Active connections |
| `telegen_db_errors_total` | Counter | `db_type`, `error_type` | Database errors |
| `telegen_db_rows_affected_total` | Counter | `db_type`, `operation` | Rows affected |
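
One way the database metrics might be combined is shown below. The `_sum` and `_count` series are assumed to exist under the usual Prometheus histogram convention. Because the error and query counters carry different secondary labels (`error_type` versus `operation`), both sides are aggregated to `db_type` before dividing.

```promql
# Error ratio per database type
sum by (db_type) (rate(telegen_db_errors_total[5m]))
  /
sum by (db_type) (rate(telegen_db_queries_total[5m]))

# Mean query latency in seconds, per database type and operation
sum by (db_type, operation) (rate(telegen_db_query_duration_seconds_sum[5m]))
  /
sum by (db_type, operation) (rate(telegen_db_query_duration_seconds_count[5m]))
```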
Connection Statistics Metrics
These metrics are emitted when a TCP connection closes and report the bytes transferred over that connection's lifetime.

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `telegen.connection.bytes_sent` | Counter | `src_ip`, `dst_ip`, `dst_port`, `protocol` | Bytes sent over the connection lifetime |
| `telegen.connection.bytes_received` | Counter | `src_ip`, `dst_ip`, `dst_port`, `protocol` | Bytes received over the connection lifetime |

These metrics complement per-request traces, giving aggregate throughput even for protocols that are not fully parsed.
Kafka Consumer Group Metrics
Kafka spans with consumer group context include the following additional span attribute:

| Attribute | Description |
|-----------|-------------|
| `messaging.kafka.consumer.group.id` | Consumer group identifier, extracted from JoinGroup and SyncGroup Kafka protocol events |

This attribute appears on spans emitted for Fetch requests and group management operations (JoinGroup, SyncGroup). Use it to filter traces by consumer group and to correlate them with consumer lag:

    # Total Kafka spans collected, by consumer group
    sum(telegen_spans_collected_total{
      messaging_system="kafka",
      messaging_kafka_consumer_group_id=~".+"
    }) by (messaging_kafka_consumer_group_id)

SNMP Metrics
SNMP metrics use MIB object names with an snmp_ prefix.
Interface Metrics (IF-MIB)

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `snmp_ifHCInOctets` | Counter | `ifIndex`, `ifDescr` | Input octets |
| `snmp_ifHCOutOctets` | Counter | `ifIndex`, `ifDescr` | Output octets |
| `snmp_ifHCInUcastPkts` | Counter | `ifIndex`, `ifDescr` | Input packets |
| `snmp_ifHCOutUcastPkts` | Counter | `ifIndex`, `ifDescr` | Output packets |
| `snmp_ifOperStatus` | Gauge | `ifIndex`, `ifDescr` | Operational status |
| `snmp_ifHighSpeed` | Gauge | `ifIndex`, `ifDescr` | Interface speed |
| `snmp_ifInErrors` | Counter | `ifIndex`, `ifDescr` | Input errors |
| `snmp_ifOutErrors` | Counter | `ifIndex`, `ifDescr` | Output errors |
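
The interface counters can be turned into link utilization, as sketched below. IF-MIB defines `ifHighSpeed` in units of 1,000,000 bits per second and `ifOperStatus` value 1 as "up"; the sketch assumes Telegen exports those raw MIB values unchanged, which this page does not state explicitly.

```promql
# Inbound utilization as a fraction of link speed (0-1).
# Octets are converted to bits (* 8) and ifHighSpeed from Mbit/s to bit/s (* 1e6).
# Interfaces reporting a speed of 0 are filtered out to avoid division by zero.
rate(snmp_ifHCInOctets[5m]) * 8
  /
((snmp_ifHighSpeed > 0) * 1e6)

# Interfaces whose operational status is anything other than up (1)
snmp_ifOperStatus != 1

# Inbound errors per second, per interface
rate(snmp_ifInErrors[5m])
```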
System Metrics (SNMPv2-MIB)

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `snmp_sysUpTime` | Gauge | - | System uptime |
| `snmp_sysName` | Info | `sysName` | System name |
| `snmp_sysDescr` | Info | `sysDescr` | System description |
Storage Array Metrics
Common Storage Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `telegen_storage_capacity_bytes` | Gauge | `array`, `pool` | Total capacity |
| `telegen_storage_used_bytes` | Gauge | `array`, `pool` | Used capacity |
| `telegen_storage_free_bytes` | Gauge | `array`, `pool` | Free capacity |
| `telegen_storage_iops_read` | Counter | `array`, `volume` | Read IOPS |
| `telegen_storage_iops_write` | Counter | `array`, `volume` | Write IOPS |
| `telegen_storage_throughput_read_bytes` | Counter | `array`, `volume` | Read throughput |
| `telegen_storage_throughput_write_bytes` | Counter | `array`, `volume` | Write throughput |
| `telegen_storage_latency_read_seconds` | Histogram | `array`, `volume` | Read latency |
| `telegen_storage_latency_write_seconds` | Histogram | `array`, `volume` | Write latency |
| `telegen_storage_controller_status` | Gauge | `array`, `controller` | Controller health |
| `telegen_storage_disk_status` | Gauge | `array`, `disk` | Disk health |
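
A sketch of capacity and latency queries over the storage metrics follows. The 0.85 threshold is an arbitrary example value, and the `_bucket` series name assumes standard Prometheus histogram exposition.

```promql
# Used fraction of each pool (0-1)
telegen_storage_used_bytes / telegen_storage_capacity_bytes

# Pools that are more than 85% full
telegen_storage_used_bytes / telegen_storage_capacity_bytes > 0.85

# p99 read latency per volume over the last 5 minutes
histogram_quantile(
  0.99,
  sum by (le, array, volume) (rate(telegen_storage_latency_read_seconds_bucket[5m]))
)
```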
Security Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `telegen_security_events_total` | Counter | `event_type`, `severity` | Security events |
| `telegen_security_syscall_total` | Counter | `syscall`, `comm` | Syscall counts |
| `telegen_security_file_access_total` | Counter | `path`, `operation` | File access |
| `telegen_security_process_exec_total` | Counter | `binary` | Process executions |
| `telegen_security_network_connections_total` | Counter | `process`, `direction` | Network connections |
| `telegen_security_privilege_escalation_total` | Counter | `type` | Privilege escalations |
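
The security counters are most useful as rates or as increases over a window. The queries below are a sketch only; the windows are illustrative and would need tuning before being used in alert rules.

```promql
# Privilege escalation types observed at least once in the last 15 minutes
sum by (type) (increase(telegen_security_privilege_escalation_total[15m])) > 0

# Security events per second, grouped by severity
sum by (severity) (rate(telegen_security_events_total[5m]))

# Ten most frequently executed binaries over the last hour
topk(10, sum by (binary) (increase(telegen_security_process_exec_total[1h])))
```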
Kubernetes Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `telegen_k8s_pods_discovered` | Gauge | `namespace` | Pods discovered |
| `telegen_k8s_services_discovered` | Gauge | `namespace` | Services discovered |
| `telegen_k8s_deployments_discovered` | Gauge | `namespace` | Deployments discovered |
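
Because the discovery gauges are labeled by namespace, they can be summed per namespace or across the cluster. A minimal sketch:

```promql
# Pods discovered in each namespace, summed across Telegen instances
sum by (namespace) (telegen_k8s_pods_discovered)

# Total deployments discovered across all namespaces
sum(telegen_k8s_deployments_discovered)
```

If more than one Telegen instance reports the same namespace, these sums may count objects more than once; whether that happens depends on how discovery is deployed, which this page does not specify.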
Metric Labels
Common Labels
Applied to most metrics:

| Label | Description | Example |
|-------|-------------|---------|
| `host.name` | Hostname | `node-1` |
| `service.name` | Service name | `my-service` |
| `service.namespace` | Namespace | `production` |
| `k8s.pod.name` | Pod name | `my-pod-abc123` |
| `k8s.namespace.name` | K8s namespace | `default` |
| `k8s.node.name` | K8s node | `node-1` |
| `k8s.deployment.name` | Deployment | `my-deployment` |
| `container.id` | Container ID | `abc123...` |

Metric Naming Conventions
Telegen follows these conventions:
Prefix: telegen_ for Telegen-specific metrics
node_exporter: node_ prefix for compatibility
SNMP: snmp_ prefix with MIB object names
Units: Suffix with unit (_bytes, _seconds, _total)