convert_fx

torch.ao.quantization.quantize_fx.convert_fx(graph_module, convert_custom_config=None, _remove_qconfig=True, qconfig_mapping=None, backend_config=None)

Convert a calibrated or trained model to a quantized model.

Parameters

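graph_module (GraphModule) – A prepared and calibrated/trained model (GraphModule).
convert_custom_config (ConvertCustomConfig) – Custom configuration for the convert function. See ConvertCustomConfig for details.
_remove_qconfig (bool) – Option to remove the qconfig attributes from the model after convert.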
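qconfig_mapping (QConfigMapping) –
Configuration specifying how to convert the model for quantization.
The keys must include those in the qconfig_mapping passed to prepare_fx or prepare_qat_fx, with the same values or None. Additional keys may be specified with their values set to None.

For each entry whose value is set to None, quantization of that entry in the model is skipped: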
qconfig_mapping = (
    QConfigMapping()
    .set_global(qconfig_from_prepare)
    .set_object_type(torch.nn.functional.add, None)  # skip quantizing torch.nn.functional.add
    .set_object_type(torch.nn.functional.linear, qconfig_from_prepare)
    .set_module_name("foo.bar", None)  # skip quantizing module "foo.bar"
)
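backend_config (BackendConfig) – Configuration for the backend that describes how operators should be quantized on that backend, including the supported quantization modes (static/dynamic/weight_only), the supported dtypes (quint8/qint8, etc.), the observer placement for each operator, and the fused operators. See BackendConfig for details.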
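The skip-on-None rule for qconfig_mapping can be sketched on a toy model. This is a minimal illustration, not a reference recipe: `TwoLinear`, its submodule names `first` and `second`, and the backend-selection line are assumptions introduced here. The mapping passed to convert_fx repeats the global qconfig used at prepare time and adds one extra key whose value is None.

```python
import torch
from torch.ao.quantization import QConfigMapping, get_default_qconfig
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx


class TwoLinear(torch.nn.Module):
    """Hypothetical toy model, used only for illustration."""

    def __init__(self):
        super().__init__()
        self.first = torch.nn.Linear(8, 8)
        self.second = torch.nn.Linear(8, 4)

    def forward(self, x):
        return self.second(self.first(x))


# Assumption: pick whichever quantized engine this PyTorch build supports.
engines = torch.backends.quantized.supported_engines
backend = "fbgemm" if "fbgemm" in engines else "qnnpack"
torch.backends.quantized.engine = backend
qconfig = get_default_qconfig(backend)

torch.manual_seed(0)
example_inputs = (torch.randn(2, 8),)

# Prepare with a single global qconfig, then run one calibration batch.
prepare_mapping = QConfigMapping().set_global(qconfig)
prepared_model = prepare_fx(TwoLinear().eval(), prepare_mapping, example_inputs)
with torch.no_grad():
    prepared_model(*example_inputs)

# Same global qconfig as in prepare, plus an additional key set to None
# so that the submodule named "second" is not quantized during convert.
convert_mapping = (
    QConfigMapping()
    .set_global(qconfig)
    .set_module_name("second", None)
)
quantized_model = convert_fx(prepared_model, qconfig_mapping=convert_mapping)
output = quantized_model(*example_inputs)
```

Under these assumptions, `quantized_model.second` is expected to remain a float `torch.nn.Linear`, while `first` is lowered to a quantized module. Reusing the same `qconfig` object in both mappings keeps the prepare and convert configurations consistent.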

Returns

A quantized model (torch.nn.Module)

Return type

GraphModule

Example

# prepared_model: the model after prepare_fx/prepare_qat_fx and calibration/training
from torch.ao.quantization.quantize_fx import convert_fx

# convert_fx converts a calibrated/trained model to a quantized model for the
# target hardware. It first converts the model to a reference quantized model
# and then lowers the reference quantized model to a backend.
# Currently, the supported backends are fbgemm (onednn) and qnnpack (xnnpack).
# They share the same set of quantized operators, so the same lowering
# procedure is used for both.
#
# backend_config defines the corresponding reference quantized module for
# the weighted modules in the model, e.g. nn.Linear
# TODO: add backend_config after we split the backend_config for fbgemm and qnnpack
# e.g. backend_config = get_default_backend_config("fbgemm")
quantized_model = convert_fx(prepared_model)
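The snippet above assumes that `prepared_model` already exists. The following self-contained sketch shows one way to produce it for post-training static quantization and then call convert_fx. `TinyModel`, the random calibration data, and the backend-selection line are assumptions made for illustration. `get_native_backend_config()` is passed explicitly here; to my understanding it matches what is used when backend_config is left as None.

```python
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.backend_config import get_native_backend_config
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx


class TinyModel(torch.nn.Module):
    """Hypothetical toy model, used only for illustration."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))


# Assumption: pick whichever quantized engine this PyTorch build supports.
engines = torch.backends.quantized.supported_engines
backend = "fbgemm" if "fbgemm" in engines else "qnnpack"
torch.backends.quantized.engine = backend

torch.manual_seed(0)
float_model = TinyModel().eval()
example_inputs = (torch.randn(2, 8),)

qconfig_mapping = get_default_qconfig_mapping(backend)
backend_config = get_native_backend_config()

# Step 1: insert observers into the traced model.
prepared_model = prepare_fx(
    float_model, qconfig_mapping, example_inputs, backend_config=backend_config
)

# Step 2: calibrate with representative inputs (random data in this sketch).
with torch.no_grad():
    for _ in range(4):
        prepared_model(torch.randn(2, 8))

# Step 3: convert the calibrated model to a quantized model.
quantized_model = convert_fx(prepared_model, backend_config=backend_config)

output = quantized_model(*example_inputs)
print(type(quantized_model).__name__, tuple(output.shape))
```

In real use, the calibration loop would iterate over a representative sample of the deployment data, since the observers derive the quantization parameters from the activations they see.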
