ainode_uptime_seconds | counter | Seconds since process start |
ainode_requests_total | counter | Total inference requests |
ainode_request_errors_total | counter | Total failed inference requests |
ainode_tokens_generated_total | counter | Total tokens generated |
ainode_tokens_per_second | gauge | Average token generation throughput (tokens/s) |
ainode_request_latency_milliseconds{quantile} | summary | P50/P95/P99 latency |
ainode_requests_by_model_total{model} | counter | Per-model request counts |
ainode_gpu_utilization_percent | gauge | GPU utilization 0–100 |
ainode_gpu_memory_used_bytes | gauge | GPU memory in use |
ainode_gpu_memory_total_bytes | gauge | Total GPU memory |
ainode_gpu_temperature_celsius | gauge | GPU temperature |
ainode_build_info{version} | gauge | Always 1; carries version label |
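
The counters and gauges above are most useful when combined into ratios. The following is a minimal sketch, assuming the metrics are exposed in the Prometheus text exposition format. The sample payload and all values in it are made up for illustration, and the helper names (`parse_metrics`, `error_ratio`, `gpu_memory_fraction`) are hypothetical, not part of any shipped API. The parser is deliberately simple and does not handle label values that contain `}`.

```python
import re

# Illustrative scrape payload. Only the metric names come from the table;
# every value (including the version label) is invented for this sketch.
SAMPLE = """\
# TYPE ainode_requests_total counter
ainode_requests_total 200
ainode_request_errors_total 5
ainode_gpu_memory_used_bytes 6442450944
ainode_gpu_memory_total_bytes 8589934592
ainode_request_latency_milliseconds{quantile="0.95"} 120.5
ainode_build_info{version="0.0.0"} 1
"""

# metric name, optional {labels} block, then the sample value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)')


def parse_metrics(text):
    """Map 'name{labels}' to a float value, skipping comments and blank lines."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        name, labels, value = match.groups()
        samples[name + (labels or "")] = float(value)
    return samples


def error_ratio(samples):
    """Fraction of requests that failed; 0.0 when no requests were seen."""
    total = samples.get("ainode_requests_total", 0.0)
    errors = samples.get("ainode_request_errors_total", 0.0)
    return errors / total if total else 0.0


def gpu_memory_fraction(samples):
    """Fraction of GPU memory in use; 0.0 when the total is missing or zero."""
    total = samples.get("ainode_gpu_memory_total_bytes", 0.0)
    used = samples.get("ainode_gpu_memory_used_bytes", 0.0)
    return used / total if total else 0.0


if __name__ == "__main__":
    metrics = parse_metrics(SAMPLE)
    print(f"error ratio: {error_ratio(metrics):.3f}")
    print(f"GPU memory in use: {gpu_memory_fraction(metrics):.0%}")
```

Because the request and error metrics are counters, a monitoring system would normally compute this ratio over a time window (the difference between two scrapes) rather than over lifetime totals as this sketch does.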