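torch_tensorrt.runtime

Functions

torch_tensorrt.runtime.set_multi_device_safe_mode(mode: bool) → _MultiDeviceSafeModeContextManager[source]

Sets the runtime (Python-only and default) into multi-device safe mode.

When multiple devices are available on the system, the runtime needs additional device checks in order to execute safely. These checks can affect performance, so they are disabled by default. This function is also used to suppress the warning about running unsafely in a multi-device context.

Parameters: mode (bool) – Enable (True) or disable (False) the multi-device checks

Example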

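import torch_tensorrt
# trt_compiled_module and inputs are assumed to exist already (a compiled module and its input tensors)
with torch_tensorrt.runtime.set_multi_device_safe_mode(True):
    results = trt_compiled_module(*inputs)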

Classes

class torch_tensorrt.runtime.TorchTensorRTModule(**kwargs: Dict[str, Any])[source]

TorchTensorRTModule is a PyTorch module that wraps an arbitrary TensorRT engine.

This module is backed by the Torch-TensorRT runtime and is fully compatible with both FX / Python deployments (just import torch_tensorrt as part of the application) and TorchScript / C++ deployments, since a TorchTensorRTModule can be passed to torch.jit.trace and then saved.

The forward function is simply forward(*args: torch.Tensor) -> Tuple[torch.Tensor], where the internal implementation is return Tuple(torch.ops.tensorrt.execute_engine(list(inputs), self.engine))

> Note: TorchTensorRTModule only supports engines built with explicit batch

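Variables

name (str) – Name of the module (for easier debugging)
engine (torch.classes.tensorrt.Engine) – Torch-TensorRT TensorRT engine instance; manages [de]serialization, device configuration, and profiling
input_binding_names (List[str]) – List of input TensorRT engine binding names, in the order they are passed to the TRT module
output_binding_names (List[str]) – List of output TensorRT engine binding names, in the order they should be returned

__init__(**kwargs: Dict[str, Any]) → Any: Initializes internal Module state, shared by both nn.Module and ScriptModule.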

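forward(**kwargs: Dict[str, Any]) → Any

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for the forward pass must be defined within this function, call the Module instance afterwards rather than this function directly: the former runs the registered hooks, while the latter silently ignores them.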
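
To make the note concrete, here is a toy sketch of the dispatch it describes. ToyModule is a hypothetical stand-in written in plain Python, not torch.nn.Module itself; it only mimics the idea that calling the instance runs forward plus any registered hooks, while calling forward directly bypasses them.

```python
from typing import Callable, List, Tuple


class ToyModule:
    """Minimal stand-in for the call/hook dispatch of a module (hypothetical)."""

    def __init__(self) -> None:
        self._forward_hooks: List[Callable[[int, int], None]] = []

    def register_forward_hook(self, hook: Callable[[int, int], None]) -> None:
        self._forward_hooks.append(hook)

    def forward(self, x: int) -> int:
        # The "recipe" for the computation lives here.
        return x * 2

    def __call__(self, x: int) -> int:
        # Calling the instance runs forward and then every registered hook.
        out = self.forward(x)
        for hook in self._forward_hooks:
            hook(x, out)
        return out


seen: List[Tuple[int, int]] = []
module = ToyModule()
module.register_forward_hook(lambda x, out: seen.append((x, out)))

via_call = module(3)             # hook runs and records (3, 6)
via_forward = module.forward(4)  # hook is skipped, nothing is recorded
```

Both calls return the doubled value, but only the first leaves a trace in seen, which is the silent difference the note warns about.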

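get_extra_state(**kwargs: Dict[str, Any]) → Any

Returns any extra state to include in the module's state_dict.

Implement this function and a corresponding set_extra_state() for your module if you need to store extra state. This function is called when building the module's state_dict().

Note that extra state should be picklable so that serialization of the state_dict works correctly. Backwards compatibility is guaranteed only for serializing Tensors; other objects may break backwards compatibility if their serialized pickled form changes.

Returns: Any extra state to store in the module's state_dict
Return type: object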

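set_extra_state(**kwargs: Dict[str, Any]) → Any

Sets the extra state contained in the loaded state_dict.

This function is called from load_state_dict() to handle any extra state found within the state_dict. Implement this function and a corresponding get_extra_state() for your module if you need to store extra state within its state_dict.

Parameters: state (dict) – Extra state from the state_dict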
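
The get_extra_state() / set_extra_state() pairing and the picklability requirement can be illustrated with a small round trip. This is a sketch under stated assumptions: ToyEngineModule is a hypothetical plain-Python class, and the explicit pickle.dumps / pickle.loads calls stand in for what saving and loading a state_dict would have to do with the extra state.

```python
import pickle
from typing import Any, Dict, List, Optional


class ToyEngineModule:
    """Hypothetical stand-in for a module that carries non-tensor state."""

    def __init__(self, name: str = "", bindings: Optional[List[str]] = None) -> None:
        self.name = name
        self.bindings = list(bindings or [])

    def get_extra_state(self) -> Dict[str, Any]:
        # Return only plain, picklable values (str, list, dict, ...).
        return {"name": self.name, "bindings": self.bindings}

    def set_extra_state(self, state: Dict[str, Any]) -> None:
        # Mirror of get_extra_state: rebuild attributes from the saved dict.
        self.name = state["name"]
        self.bindings = list(state["bindings"])


original = ToyEngineModule("my_module", ["x", "output"])
blob = pickle.dumps(original.get_extra_state())  # fails here if the state is not picklable

restored = ToyEngineModule()
restored.set_extra_state(pickle.loads(blob))
```

Keeping the extra state to built-in types, as in this sketch, also limits exposure to the compatibility caveat above, since the pickled form of custom objects can change between versions.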

class torch_tensorrt.runtime.PythonTorchTensorRTModule(serialized_engine: Optional[bytes] = None, input_binding_names: Optional[List[str]] = None, output_binding_names: Optional[List[str]] = None, *, name: str = '', settings: CompilationSettings = CompilationSettings(enabled_precisions={<dtype.f32: 7>}, debug=False, workspace_size=0, min_block_size=5, torch_executed_ops=set(), pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False, device=Device(type=DeviceType.GPU, gpu_id=0), require_full_compilation=False, disable_tf32=False, assume_dynamic_shape_support=False, sparse_weights=False, engine_capability=<EngineCapability.STANDARD: 1>, num_avg_timing_iters=1, dla_sram_size=1048576, dla_local_dram_size=1073741824, dla_global_dram_size=536870912, dryrun=False, hardware_compatible=False, timing_cache_path='/tmp/torch_tensorrt_engine_cache/timing_cache.bin', lazy_engine_init=False, cache_built_engines=False, reuse_cached_engines=False, use_explicit_typing=False, use_fp32_acc=False, refit_identical_engine_weights=False, strip_engine_weights=False, immutable_weights=True, enable_weight_streaming=False, enable_cross_compile_for_windows=False, use_aot_joint_export=True), weight_name_map: Optional[dict[Any, Any]] = None)[source]

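PythonTorchTensorRTModule is a PyTorch module that wraps an arbitrary TensorRT engine.

This module is backed by the Torch-TensorRT runtime and is only compatible with FX / Dynamo / Python deployments. It cannot be serialized to TorchScript via torch.jit.trace for C++ deployment.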

__init__(serialized_engine: Optional[bytes] = None, input_binding_names: Optional[List[str]] = None, output_binding_names: Optional[List[str]] = None, *, name: str = '', settings: CompilationSettings = CompilationSettings(enabled_precisions={<dtype.f32: 7>}, debug=False, workspace_size=0, min_block_size=5, torch_executed_ops=set(), pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False, device=Device(type=DeviceType.GPU, gpu_id=0), require_full_compilation=False, disable_tf32=False, assume_dynamic_shape_support=False, sparse_weights=False, engine_capability=<EngineCapability.STANDARD: 1>, num_avg_timing_iters=1, dla_sram_size=1048576, dla_local_dram_size=1073741824, dla_global_dram_size=536870912, dryrun=False, hardware_compatible=False, timing_cache_path='/tmp/torch_tensorrt_engine_cache/timing_cache.bin', lazy_engine_init=False, cache_built_engines=False, reuse_cached_engines=False, use_explicit_typing=False, use_fp32_acc=False, refit_identical_engine_weights=False, strip_engine_weights=False, immutable_weights=True, enable_weight_streaming=False, enable_cross_compile_for_windows=False, use_aot_joint_export=True), weight_name_map: Optional[dict[Any, Any]] = None)[source]

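Takes a name, target device, serialized TensorRT engine, and binding names / order, and constructs a PyTorch torch.nn.Module around it. The engine is executed through the TensorRT Python API.

Parameters

serialized_engine (bytes) – Serialized TensorRT engine in the form of a bytearray
input_binding_names (List[str]) – List of input TensorRT engine binding names, in the order they are passed to the TRT module
output_binding_names (List[str]) – List of output TensorRT engine binding names, in the order they should be returned

Keyword Arguments

name (str) – Name of the module
settings (CompilationSettings) – Settings used to compile the engine; if no object is passed, the engine is assumed to have been built with the default compilation settings
weight_name_map (dict) – Mapping from engine weight names to state_dict weight names

Example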

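import torch
import torch_tensorrt
from torch_tensorrt.dynamo._settings import CompilationSettings
from torch_tensorrt.runtime import PythonTorchTensorRTModule

# engine_str: bytes of a serialized TensorRT engine, assumed to be produced elsewhere
trt_module = PythonTorchTensorRTModule(
    engine_str,
    input_binding_names=["x"],
    output_binding_names=["output"],
    name="my_module",
    # current_device must be called; wrapping the index in Device mirrors the default shown in the signature
    settings=CompilationSettings(device=torch_tensorrt.Device(gpu_id=torch.cuda.current_device())),
)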
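
The binding-name lists are order-sensitive: the i-th positional input is paired with the i-th entry of input_binding_names. A minimal sketch of that pairing follows, assuming a hypothetical helper bind_inputs that is not part of torch_tensorrt and using plain Python values in place of tensors.

```python
from typing import Dict, List, Sequence


def bind_inputs(names: List[str], inputs: Sequence[object]) -> Dict[str, object]:
    """Pair positional inputs with binding names by position (hypothetical helper)."""
    if len(names) != len(inputs):
        raise ValueError(
            f"expected {len(names)} inputs for bindings {names}, got {len(inputs)}"
        )
    # zip preserves order, so inputs[i] is assigned to names[i].
    return dict(zip(names, inputs))


bound = bind_inputs(["x", "mask"], ["tensor_for_x", "tensor_for_mask"])

try:
    bind_inputs(["x", "mask"], ["only_one"])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

Swapping two names in the list would silently route each input to the wrong binding, so the lists should follow the order in which the engine was built.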

disable_profiling() → None[source]: Disables TensorRT profiling.

enable_profiling(profiler: IProfiler = None) → None[source]: Enables TensorRT profiling. After this function is called, TensorRT reports the time spent in each layer to stdout on every forward run.

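forward(*inputs: Tensor) → Union[Tensor, Tuple[Tensor, ...]][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for the forward pass must be defined within this function, call the Module instance afterwards rather than this function directly: the former runs the registered hooks, while the latter silently ignores them.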

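get_layer_info() → str[source]: Gets layer information for the engine. Only supported for TRT > 8.2.

validate_input_shapes(inputs: Sequence[Tensor]) → bool[source]: Checks whether the input shapes passed to the forward function have changed.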
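
One plausible way to detect a shape change is to remember the shapes from the previous call and compare. The sketch below is an illustration only: ShapeTracker and shapes_changed are hypothetical names, shapes are plain tuples rather than tensors, and treating the very first call as a change is a choice of this sketch, not documented library behavior.

```python
from typing import Optional, Sequence, Tuple

Shape = Tuple[int, ...]


class ShapeTracker:
    """Hypothetical helper that remembers the last set of input shapes."""

    def __init__(self) -> None:
        self._last: Optional[Tuple[Shape, ...]] = None

    def shapes_changed(self, shapes: Sequence[Shape]) -> bool:
        # Normalize to a tuple of tuples so the comparison is by value.
        current = tuple(tuple(s) for s in shapes)
        changed = current != self._last
        self._last = current
        return changed


tracker = ShapeTracker()
first = tracker.shapes_changed([(1, 3, 224, 224)])    # nothing cached yet
repeat = tracker.shapes_changed([(1, 3, 224, 224)])   # same shapes as before
resized = tracker.shapes_changed([(8, 3, 224, 224)])  # batch dimension differs
```

A check of this kind lets a runtime skip reconfiguring its execution context when consecutive calls use identical shapes.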
