torch.backends¶

torch.backends controls the behavior of the various backends that PyTorch supports.

These backends include:

torch.backends.cpu
torch.backends.cuda
torch.backends.cudnn
torch.backends.cusparselt
torch.backends.mha
torch.backends.mps
torch.backends.mkl
torch.backends.mkldnn
torch.backends.nnpack
torch.backends.openmp
torch.backends.opt_einsum
torch.backends.xeon

torch.backends.cpu¶

torch.backends.cpu.get_cpu_capability()[source]¶

Return the CPU capability as a string value.

Possible values:
- "DEFAULT"
- "VSX"
- "Z VECTOR"
- "NO AVX"
- "AVX2"
- "AVX512"
- "SVE256"

Return type: str

torch.backends.cuda¶

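torch.backends.cuda.is_built()[source]¶

Return whether PyTorch is built with CUDA support.

Note that this does not necessarily mean CUDA is available; it only means that if this PyTorch binary were run on a machine with working CUDA drivers and devices, we would be able to use it.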
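The difference between "built with CUDA" and "CUDA usable right now" can be seen by comparing this function with `torch.cuda.is_available()`. The following is a minimal sketch; the printed messages are illustrative only.

```python
import torch

built = torch.backends.cuda.is_built()   # compiled with CUDA support?
usable = torch.cuda.is_available()       # working driver and device present?

if built and not usable:
    print("CUDA build, but no usable driver or device on this machine.")
elif usable:
    print("CUDA is built in and usable.")
else:
    print("This PyTorch binary was built without CUDA support.")
```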

torch.backends.cuda.matmul.allow_tf32¶: A bool that controls whether TensorFloat-32 tensor cores may be used in matrix multiplications on Ampere or newer GPUs. See TensorFloat-32 (TF32) on Ampere (and later) devices.

torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction¶: A bool that controls whether reduced-precision reductions (e.g., with an fp16 accumulation type) are allowed with fp16 GEMMs.

torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction¶: A bool that controls whether reduced-precision reductions are allowed with bf16 GEMMs.
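These attributes are plain module-level flags, so they can be read, changed, and restored like ordinary variables. A minimal sketch, assuming you want TF32 enabled only for one region of code:

```python
import torch

# Remember the current setting so it can be restored afterwards.
previous = torch.backends.cuda.matmul.allow_tf32

torch.backends.cuda.matmul.allow_tf32 = True
enabled_inside = torch.backends.cuda.matmul.allow_tf32
# ... run matrix multiplications that may use TF32 on supported GPUs ...

# Restore the original setting.
torch.backends.cuda.matmul.allow_tf32 = previous
```

The flag only has an effect on GPUs that support TF32; on other hardware, setting it changes nothing observable.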

torch.backends.cuda.cufft_plan_cache¶

cufft_plan_cache contains the cuFFT plan caches for each CUDA device. Query the cache of a specific device i via torch.backends.cuda.cufft_plan_cache[i].

torch.backends.cuda.cufft_plan_cache.size¶: A read-only int that shows the number of plans currently in a cuFFT plan cache.

torch.backends.cuda.cufft_plan_cache.max_size¶: An int that controls the capacity of a cuFFT plan cache.

torch.backends.cuda.cufft_plan_cache.clear()¶: Clears a cuFFT plan cache.
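A short sketch of inspecting a per-device cache follows. The helper name `describe_cufft_cache` is hypothetical, and the function returns None when no CUDA device is usable, since the cache can only be queried on a CUDA machine.

```python
import torch

def describe_cufft_cache(device_index=0):
    """Hypothetical helper: report size and capacity of one device's cuFFT plan cache."""
    if not torch.cuda.is_available():
        return None  # no CUDA device, so there is no cache to query
    cache = torch.backends.cuda.cufft_plan_cache[device_index]
    return {"size": cache.size, "max_size": cache.max_size}

info = describe_cufft_cache()
print(info)
```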

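torch.backends.cuda.preferred_blas_library(backend=None)[source]¶

Override the library PyTorch uses for BLAS operations. Choose between cuBLAS, cuBLASLt, and CK [ROCm only].

Warning

This flag is experimental and subject to change without notice.

When PyTorch runs a CUDA BLAS operation, it defaults to cuBLAS even if both cuBLAS and cuBLASLt are available. For PyTorch built for ROCm, hipBLAS, hipBLASLt, and CK may offer different performance. This flag (a str) allows overriding which BLAS library to use.

If "cublas" is set, cuBLAS will be used wherever possible.
If "cublaslt" is set, cuBLASLt will be used wherever possible.
If "ck" is set, CK will be used wherever possible.
When no input is given, this function returns the currently preferred library.
Users may set the environment variable TORCH_BLAS_PREFER_CUBLASLT=1 to make cuBLASLt the preferred library globally. This flag only sets the initial value of the preferred library; the preference may still be overridden by a later call to this function in your script.

Note: When a library is preferred, other libraries may still be used if the preferred library does not implement the operation(s) called. This flag may achieve better performance if PyTorch's library selection is not well suited to your application's inputs.

Return type: _BlasBackend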

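torch.backends.cuda.preferred_linalg_library(backend=None)[source]¶

Override the heuristic PyTorch uses to choose between cuSOLVER and MAGMA for CUDA linear algebra operations.

Warning

This flag is experimental and subject to change without notice.

When PyTorch runs a CUDA linear algebra operation, it often uses the cuSOLVER or MAGMA libraries, and if both are available it decides which to use with a heuristic. This flag (a str) allows overriding those heuristics.

If "cusolver" is set, cuSOLVER will be used wherever possible.
If "magma" is set, MAGMA will be used wherever possible.
If "default" (the default) is set, heuristics will be used to pick between cuSOLVER and MAGMA if both are available.
When no input is given, this function returns the currently preferred library.
Users may set the environment variable TORCH_LINALG_PREFER_CUSOLVER=1 to make cuSOLVER the preferred library globally. This flag only sets the initial value of the preferred library; the preference may still be overridden by a later call to this function in your script.

Note: When a library is preferred, other libraries may still be used if the preferred library does not implement the operation(s) called. This flag may achieve better performance if PyTorch's heuristic library selection is not well suited to your application's inputs.

Currently supported linalg operators

Return type: _LinalgBackend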
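The query-then-set pattern described above might look like the sketch below. It only selects "default", which keeps the heuristics in place; note that this resets any preference set earlier in the process.

```python
import torch

# Read the current preference without changing it.
before = torch.backends.cuda.preferred_linalg_library()

# Explicitly select the heuristic-based default; the call returns the new preference.
after = torch.backends.cuda.preferred_linalg_library("default")
print(f"before={before}, after={after}")
```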

class torch.backends.cuda.SDPAParams¶

torch.backends.cuda.flash_sdp_enabled()[source]¶

Warning

This flag is beta and subject to change.

Returns whether flash scaled dot product attention is enabled.

torch.backends.cuda.enable_mem_efficient_sdp(enabled)[source]¶

Warning

This flag is beta and subject to change.

Enables or disables memory-efficient scaled dot product attention.

torch.backends.cuda.mem_efficient_sdp_enabled()[source]¶

Warning

This flag is beta and subject to change.

Returns whether memory-efficient scaled dot product attention is enabled.

torch.backends.cuda.enable_flash_sdp(enabled)[source]¶

Warning

This flag is beta and subject to change.

Enables or disables flash scaled dot product attention.

torch.backends.cuda.math_sdp_enabled()[source]¶

Warning

This flag is beta and subject to change.

Returns whether math scaled dot product attention is enabled.

torch.backends.cuda.enable_math_sdp(enabled)[source]¶

Warning

This flag is beta and subject to change.

Enables or disables math scaled dot product attention.

torch.backends.cuda.fp16_bf16_reduction_math_sdp_allowed()[source]¶

Warning

This flag is beta and subject to change.

Returns whether fp16/bf16 reduction is allowed in math scaled dot product attention.

torch.backends.cuda.allow_fp16_bf16_reduction_math_sdp(enabled)[source]¶

Warning

This flag is beta and subject to change.

Enables or disables fp16/bf16 reduction in math scaled dot product attention.

torch.backends.cuda.cudnn_sdp_enabled()[source]¶

Warning

This flag is beta and subject to change.

Returns whether cuDNN scaled dot product attention is enabled.

torch.backends.cuda.enable_cudnn_sdp(enabled)[source]¶

Warning

This flag is beta and subject to change.

Enables or disables cuDNN scaled dot product attention.
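Each `enable_*_sdp` setter has a matching `*_sdp_enabled` getter, so a setting can be changed and later restored. A minimal sketch using the flash pair (the same pattern should apply to the other pairs):

```python
import torch

# Remember the current state of the flash backend flag.
was_enabled = torch.backends.cuda.flash_sdp_enabled()

# Disable flash scaled dot product attention and confirm the change.
torch.backends.cuda.enable_flash_sdp(False)
state_after_disable = torch.backends.cuda.flash_sdp_enabled()

# Restore the original state.
torch.backends.cuda.enable_flash_sdp(was_enabled)
```

These flags are global to the process, which is why the sketch restores the previous value.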

torch.backends.cuda.is_flash_attention_available()[source]¶

Check whether PyTorch was built with FlashAttention for scaled_dot_product_attention.

Returns: True if FlashAttention is built and available; otherwise, False.
Return type: bool

Note

This function depends on a CUDA-enabled build of PyTorch. It returns False in non-CUDA environments.

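torch.backends.cuda.can_use_flash_attention(params, debug=False)[source]¶

Check whether FlashAttention can be used in scaled_dot_product_attention.

Parameters

params (_SDPAParams) – An instance of SDPAParams containing the tensors for query, key, and value, an optional attention mask, the dropout rate, and a flag indicating whether the attention is causal.
debug (bool) – Whether to log, via logging.warn, debug information on why FlashAttention could not be run. Defaults to False.

Returns

True if FlashAttention can be used with the given parameters; otherwise, False.

Return type

bool

Note

This function depends on a CUDA-enabled build of PyTorch. It returns False in non-CUDA environments.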

torch.backends.cuda.can_use_efficient_attention(params, debug=False)[source]¶

Check whether efficient_attention can be used in scaled_dot_product_attention.

Parameters

params (_SDPAParams) – An instance of SDPAParams containing the tensors for query, key, and value, an optional attention mask, the dropout rate, and a flag indicating whether the attention is causal.
debug (bool) – Whether to log, via logging.warn, information on why efficient_attention could not be run. Defaults to False.

Returns

True if efficient_attention can be used with the given parameters; otherwise, False.

Return type

bool

Note

This function depends on a CUDA-enabled build of PyTorch. It returns False in non-CUDA environments.

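torch.backends.cuda.can_use_cudnn_attention(params, debug=False)[source]¶

Check whether cudnn_attention can be used in scaled_dot_product_attention.

Parameters

params (_SDPAParams) – An instance of SDPAParams containing the tensors for query, key, and value, an optional attention mask, the dropout rate, and a flag indicating whether the attention is causal.
debug (bool) – Whether to log, via logging.warn, information on why cuDNN attention could not be run. Defaults to False.

Returns

True if cuDNN can be used with the given parameters; otherwise, False.

Return type

bool

Note

This function depends on a CUDA-enabled build of PyTorch. It returns False in non-CUDA environments.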

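torch.backends.cuda.sdp_kernel(enable_flash=True, enable_math=True, enable_mem_efficient=True, enable_cudnn=True)[source]¶

Warning

This flag is beta and subject to change.

This context manager can be used to temporarily enable or disable any of the backends for scaled dot product attention. Upon exiting the context manager, the previous state of the flags is restored.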
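The sketch below restricts attention to the math backend for a single call. The tensor shapes are arbitrary, and `torch.nn.functional.scaled_dot_product_attention` is used as the consumer of the flags. Recent PyTorch releases may emit a deprecation warning for this context manager, so the sketch silences warnings; treat that as an assumption about your installed version.

```python
import warnings

import torch
import torch.nn.functional as F

# Arbitrary small inputs: (batch, heads, sequence length, head dimension).
q = torch.randn(1, 2, 4, 8)
k = torch.randn(1, 2, 4, 8)
v = torch.randn(1, 2, 4, 8)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide a possible deprecation warning
    # Only the math backend stays enabled inside this block.
    with torch.backends.cuda.sdp_kernel(
        enable_flash=False, enable_math=True, enable_mem_efficient=False
    ):
        out = F.scaled_dot_product_attention(q, k, v)

print(out.shape)
```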

torch.backends.cudnn¶

torch.backends.cudnn.version()[source]¶: Return the version of cuDNN.

torch.backends.cudnn.is_available()[source]¶: Return a bool indicating whether cuDNN is currently available.

torch.backends.cudnn.enabled¶: A bool that controls whether cuDNN is enabled.

torch.backends.cudnn.allow_tf32¶: A bool that controls whether TensorFloat-32 tensor cores may be used in cuDNN convolutions on Ampere or newer GPUs. See TensorFloat-32 (TF32) on Ampere (and later) devices.

torch.backends.cudnn.deterministic¶: A bool that, if True, causes cuDNN to use only deterministic convolution algorithms. See also torch.are_deterministic_algorithms_enabled() and torch.use_deterministic_algorithms().

torch.backends.cudnn.benchmark¶: A bool that, if True, causes cuDNN to benchmark multiple convolution algorithms and select the fastest.

torch.backends.cudnn.benchmark_limit¶: An int that specifies the maximum number of cuDNN convolution algorithms to try when torch.backends.cudnn.benchmark is True. Set benchmark_limit to zero to try every available algorithm. Note that this setting only affects convolutions dispatched via the cuDNN v8 API.
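A common use of these flags is to favor reproducibility over speed by combining `deterministic` and `benchmark`. The following is a minimal sketch of that combination, not a complete reproducibility recipe; it restores the previous values at the end.

```python
import torch

# Save the current settings.
prev_deterministic = torch.backends.cudnn.deterministic
prev_benchmark = torch.backends.cudnn.benchmark

# Restrict cuDNN to deterministic convolution algorithms and skip autotuning,
# so algorithm selection does not vary between runs.
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
settings_inside = (torch.backends.cudnn.deterministic, torch.backends.cudnn.benchmark)
# ... run convolutions here ...

# Restore the previous settings.
torch.backends.cudnn.deterministic = prev_deterministic
torch.backends.cudnn.benchmark = prev_benchmark
```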

torch.backends.cusparselt¶

torch.backends.cusparselt.version()[source]¶

Return the version of cuSPARSELt.

Return type: Optional[int]

torch.backends.cusparselt.is_available()[source]¶

Return a bool indicating whether cuSPARSELt is currently available.

Return type: bool

torch.backends.mha¶

torch.backends.mha.get_fastpath_enabled()[source]¶

Returns whether the fast path for TransformerEncoder and MultiHeadAttention is enabled, or True if jit is scripting.

Note: The fast path might not run even if get_fastpath_enabled returns True, unless all conditions on the inputs are met.

Return type: bool

torch.backends.mha.set_fastpath_enabled(value)[source]¶

Sets whether the fast path is enabled.

torch.backends.mps¶

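torch.backends.mps.is_available()[source]¶

Return a bool indicating whether MPS is currently available.

Return type: bool

torch.backends.mps.is_built()[source]¶

Return whether PyTorch is built with MPS support.

Note that this does not necessarily mean MPS is available; it only means that if this PyTorch binary were run on a machine with working MPS drivers and devices, we would be able to use it.

Return type: bool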
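A typical use of `is_available()` is choosing a device with a CPU fallback. The sketch below shows that pattern; the fallback policy itself is an illustrative choice.

```python
import torch

# Prefer MPS when it is usable, otherwise fall back to the CPU.
if torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

x = torch.ones(2, 2, device=device)
print(x.device)
```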

torch.backends.mkl¶

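torch.backends.mkl.is_available()[source]¶: Return whether PyTorch is built with MKL support.

class torch.backends.mkl.verbose(enable)[source]¶

On-demand oneMKL verbose functionality.

To make performance issues easier to debug, oneMKL can dump verbose messages containing execution information, such as the duration of each kernel execution. The verbose functionality can be invoked via an environment variable named MKL_VERBOSE. However, that approach dumps messages at every step, which produces a large volume of output. Moreover, when investigating performance issues, the verbose messages from a single iteration are usually enough. This on-demand verbose functionality makes it possible to control the scope of verbose message dumping. In the following example, verbose messages are dumped for the second inference only.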

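import torch

model = torch.nn.Linear(4, 2)  # placeholder model (assumption, not from the original example)
data = torch.randn(1, 4)       # placeholder input (assumption)
model(data)  # first inference: no verbose output
with torch.backends.mkl.verbose(torch.backends.mkl.VERBOSE_ON):
    model(data)  # second inference: verbose messages are dumped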

Parameters: level – Verbose level
- VERBOSE_OFF: Disable verbose mode
- VERBOSE_ON: Enable verbose mode

torch.backends.mkldnn¶

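torch.backends.mkldnn.is_available()[source]¶: Return whether PyTorch is built with MKL-DNN support.

class torch.backends.mkldnn.verbose(level)[source]¶

On-demand oneDNN (formerly MKL-DNN) verbose functionality.

To make performance issues easier to debug, oneDNN can dump verbose messages containing information such as kernel size, input data size, and execution duration. The verbose functionality can be invoked via an environment variable named DNNL_VERBOSE. However, that approach dumps messages at every step, which produces a large volume of output. Moreover, when investigating performance issues, the verbose messages from a single iteration are usually enough. This on-demand verbose functionality makes it possible to control the scope of verbose message dumping. In the following example, verbose messages are dumped for the second inference only.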

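import torch

model = torch.nn.Linear(4, 2)  # placeholder model (assumption, not from the original example)
data = torch.randn(1, 4)       # placeholder input (assumption)
model(data)  # first inference: no verbose output
with torch.backends.mkldnn.verbose(torch.backends.mkldnn.VERBOSE_ON):
    model(data)  # second inference: verbose messages are dumped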

Parameters: level – Verbose level
- VERBOSE_OFF: Disable verbose mode
- VERBOSE_ON: Enable verbose mode
- VERBOSE_ON_CREATION: Enable verbose mode, including oneDNN kernel creation

torch.backends.nnpack¶

torch.backends.nnpack.is_available()[source]¶: Return whether PyTorch is built with NNPACK support.

torch.backends.nnpack.flags(enabled=False)[source]¶: Context manager for setting whether nnpack is enabled globally.

torch.backends.nnpack.set_flags(_enabled)[source]¶: Set whether nnpack is enabled globally.

torch.backends.openmp¶

torch.backends.openmp.is_available()[source]¶: Return whether PyTorch is built with OpenMP support.
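Several of the CPU-side backends expose the same `is_available()` check, which makes it easy to summarize what a given PyTorch binary was built with. A minimal sketch; the dictionary layout is an illustrative choice.

```python
import torch

# Collect the build-time availability of a few CPU backends.
build_support = {
    "mkl": torch.backends.mkl.is_available(),
    "mkldnn": torch.backends.mkldnn.is_available(),
    "openmp": torch.backends.openmp.is_available(),
}

for name, available in build_support.items():
    print(f"{name}: {'available' if available else 'not available'}")
```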

torch.backends.opt_einsum¶

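torch.backends.opt_einsum.is_available()[source]¶

Return a bool indicating whether opt_einsum is currently available.

You must install opt-einsum for torch to optimize einsum automatically. To make opt-einsum available, you can install it along with torch: pip install torch[opt-einsum], or on its own: pip install opt-einsum. If the package is installed, torch imports it automatically and uses it accordingly. Use this function to check whether opt-einsum was installed and properly imported by torch.

Return type: bool

torch.backends.opt_einsum.get_opt_einsum()[source]¶

Return the opt_einsum package if opt_einsum is currently available, else None.

Return type: Any

torch.backends.opt_einsum.enabled¶

A bool that controls whether opt_einsum is enabled (True by default). If so, torch.einsum will use opt_einsum (https://optimized-einsum.readthedocs.io/en/stable/path_finding.html), if available, to calculate an optimal contraction path for faster performance.

If opt_einsum is not available, torch.einsum falls back to the default contraction path of left to right.

torch.backends.opt_einsum.strategy¶: A str that specifies which strategies to try when torch.backends.opt_einsum.enabled is True. By default, torch.einsum tries the "auto" strategy, but the "greedy" and "optimal" strategies are also supported. Note that the "optimal" strategy is factorial in the number of inputs, since it tries all possible paths. See the opt_einsum documentation for more details (https://optimized-einsum.readthedocs.io/en/stable/path_finding.html).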
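The sketch below switches to the "greedy" strategy for one three-operand contraction and then restores the previous value. The strategy is only changed when opt_einsum is available; otherwise torch.einsum runs with its default left-to-right path. Operand shapes are arbitrary.

```python
import torch
import torch.backends.opt_einsum  # explicit import in case the submodule is not loaded by default

a = torch.randn(2, 3)
b = torch.randn(3, 4)
c = torch.randn(4, 5)

if torch.backends.opt_einsum.is_available():
    # Temporarily use the "greedy" path-finding strategy.
    previous = torch.backends.opt_einsum.strategy
    torch.backends.opt_einsum.strategy = "greedy"
    result = torch.einsum("ij,jk,kl->il", a, b, c)
    torch.backends.opt_einsum.strategy = previous
else:
    # opt_einsum is not installed: einsum contracts left to right.
    result = torch.einsum("ij,jk,kl->il", a, b, c)

print(result.shape)
```

The strategy affects only the order of contraction, so the numerical result should match a plain chained matrix product up to floating-point rounding.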
