指標 API¶
指標 API 是一個 HTTP API,用於以 Prometheus 格式獲取指標。 它監聽 8082 埠,並且預設只能從 localhost 存取。 若要變更預設設定,請參閱TorchServe 組態。 指標端點預設為啟用,並且在 metrics_mode 組態設定為 prometheus
時,會傳回 Prometheus 格式的指標。 您可以使用 curl 請求查詢指標,或將 Prometheus 伺服器指向端點,並使用 Grafana 建立儀表板。
預設情況下,這些 API 是啟用的,但是可以透過在 torchserve config.properties 檔案中設定 enable_metrics_api=false
來停用。 有關詳細資訊,請參閱 Torchserve 組態 文件。
注意 不要與 torch serve 的自訂指標 API混淆。 自訂指標 API 用於根據設定的 metrics_mode
(log 或 prometheus) 收集自訂後端指標。 有關此 API 的更多資訊,請參閱此處。
curl http://127.0.0.1:8082/metrics
# HELP Requests5XX Torchserve prometheus counter metric with unit: Count
# TYPE Requests5XX counter
# HELP DiskUsage Torchserve prometheus gauge metric with unit: Gigabytes
# TYPE DiskUsage gauge
DiskUsage{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 20.054508209228516
# HELP GPUUtilization Torchserve prometheus gauge metric with unit: Percent
# TYPE GPUUtilization gauge
# HELP PredictionTime Torchserve prometheus gauge metric with unit: ms
# TYPE PredictionTime gauge
PredictionTime{ModelName="resnet18",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 83.13
# HELP WorkerLoadTime Torchserve prometheus gauge metric with unit: Milliseconds
# TYPE WorkerLoadTime gauge
WorkerLoadTime{WorkerName="W-9000-resnet18_1.0",Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 4593.0
WorkerLoadTime{WorkerName="W-9001-resnet18_1.0",Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 4592.0
# HELP MemoryAvailable Torchserve prometheus gauge metric with unit: Megabytes
# TYPE MemoryAvailable gauge
MemoryAvailable{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 5829.7421875
# HELP GPUMemoryUsed Torchserve prometheus gauge metric with unit: Megabytes
# TYPE GPUMemoryUsed gauge
# HELP ts_inference_requests_total Torchserve prometheus counter metric with unit: Count
# TYPE ts_inference_requests_total counter
ts_inference_requests_total{model_name="resnet18",model_version="default",hostname="88665a372f4b.ant.amazon.com",} 3.0
# HELP GPUMemoryUtilization Torchserve prometheus gauge metric with unit: Percent
# TYPE GPUMemoryUtilization gauge
# HELP HandlerTime Torchserve prometheus gauge metric with unit: ms
# TYPE HandlerTime gauge
HandlerTime{ModelName="resnet18",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 82.93
# HELP ts_inference_latency_microseconds Torchserve prometheus counter metric with unit: Microseconds
# TYPE ts_inference_latency_microseconds counter
ts_inference_latency_microseconds{model_name="resnet18",model_version="default",hostname="88665a372f4b.ant.amazon.com",} 290371.129
# HELP CPUUtilization Torchserve prometheus gauge metric with unit: Percent
# TYPE CPUUtilization gauge
CPUUtilization{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 0.0
# HELP MemoryUsed Torchserve prometheus gauge metric with unit: Megabytes
# TYPE MemoryUsed gauge
MemoryUsed{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 8245.62109375
# HELP QueueTime Torchserve prometheus gauge metric with unit: Milliseconds
# TYPE QueueTime gauge
QueueTime{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 0.0
# HELP ts_queue_latency_microseconds Torchserve prometheus counter metric with unit: Microseconds
# TYPE ts_queue_latency_microseconds counter
ts_queue_latency_microseconds{model_name="resnet18",model_version="default",hostname="88665a372f4b.ant.amazon.com",} 365.21
# HELP DiskUtilization Torchserve prometheus gauge metric with unit: Percent
# TYPE DiskUtilization gauge
DiskUtilization{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 5.8
# HELP Requests2XX Torchserve prometheus counter metric with unit: Count
# TYPE Requests2XX counter
Requests2XX{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 8.0
# HELP Requests4XX Torchserve prometheus counter metric with unit: Count
# TYPE Requests4XX counter
# HELP WorkerThreadTime Torchserve prometheus gauge metric with unit: Milliseconds
# TYPE WorkerThreadTime gauge
WorkerThreadTime{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 1.0
# HELP DiskAvailable Torchserve prometheus gauge metric with unit: Gigabytes
# TYPE DiskAvailable gauge
DiskAvailable{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 325.05113983154297
# HELP MemoryUtilization Torchserve prometheus gauge metric with unit: Percent
# TYPE MemoryUtilization gauge
MemoryUtilization{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 64.4
curl "http://127.0.0.1:8082/metrics?name[]=ts_inference_latency_microseconds&name[]=ts_queue_latency_microseconds" --globoff
# HELP ts_queue_latency_microseconds Torchserve prometheus counter metric with unit: Microseconds
# TYPE ts_queue_latency_microseconds counter
ts_queue_latency_microseconds{model_name="resnet18",model_version="default",hostname="88665a372f4b.ant.amazon.com",} 365.21
# HELP ts_inference_latency_microseconds Torchserve prometheus counter metric with unit: Microseconds
# TYPE ts_inference_latency_microseconds counter
ts_inference_latency_microseconds{model_name="resnet18",model_version="default",hostname="88665a372f4b.ant.amazon.com",} 290371.129
Prometheus 伺服器¶
若要在 Prometheus 伺服器上檢視這些指標,請使用此處的指示下載並安裝。 如下所示建立一個最小的 prometheus.yml
組態檔案,並執行 ./prometheus --config.file=prometheus.yml
。
global:
scrape_interval: 15s
evaluation_interval: 15s
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
- job_name: 'torchserve'
static_configs:
- targets: ['localhost:8082'] #TorchServe metrics endpoint
在瀏覽器上導覽至 https://127.0.0.1:9090/
以執行查詢並建立圖表
data:image/s3,"s3://crabby-images/96faf/96fafc0053e5f3674d332d7fce71e94a344c16d3" alt="Prometheus Server"
Grafana¶
一旦您執行 Torchserve 和 Prometheus 伺服器,您可以進一步設定 Grafana,將其指向 Prometheus 伺服器,並導覽至 https://127.0.0.1:3000/
以建立儀表板和圖表。
您可以使用以下命令來啟動 Grafana - sudo systemctl daemon-reload && sudo systemctl enable grafana-server && sudo systemctl start grafana-server
data:image/s3,"s3://crabby-images/755d1/755d1a967d16f3e9ff277b748413b2d3e4704d79" alt="Grafana Dashboard"