Feature extraction for model inspection

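The torchvision.models.feature_extraction package contains feature extraction utilities that let us tap into our models to access intermediate transformations of our inputs. This is useful for a variety of computer vision applications. A few examples: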

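Visualizing feature maps.
Extracting features to compute image descriptors for tasks such as facial recognition, copy detection, or image retrieval.
Passing selected features to downstream sub-networks for end-to-end training with a specific task in mind. For example, passing a hierarchy of features to a Feature Pyramid Network with object detection heads.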

Torchvision provides create_feature_extractor() for this purpose. It works roughly as follows:

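Symbolically trace the model to get a graphical representation of how it transforms the input, step by step.
Set the user-selected graph nodes as outputs.
Remove all redundant nodes (anything downstream of the output nodes).
Generate Python code from the resulting graph and bundle it, together with the graph itself, into a PyTorch module.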

The torch.fx documentation gives a more general and detailed explanation of the above procedure and of the inner workings of symbolic tracing.
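The four steps above can be approximated directly with torch.fx. The following is a minimal sketch under stated assumptions, not torchvision's actual implementation: TinyNet and the output key "hidden" are hypothetical names chosen for illustration, and the real create_feature_extractor() additionally handles its own node naming, train/eval differences, and leaf modules.

```python
import torch
import torch.fx
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical toy model used only for this illustration."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        hidden = torch.relu(self.fc1(x))
        return self.fc2(hidden)


model = TinyNet().eval()

# Step 1: symbolically trace the model to get a graph of its operations.
traced = torch.fx.symbolic_trace(model)
graph = traced.graph

# Step 2: make the chosen intermediate node the graph output.
# Here we pick the node that feeds fc2, i.e. the ReLU output.
fc2_node = next(
    n for n in graph.nodes if n.op == "call_module" and n.target == "fc2"
)
hidden_node = fc2_node.args[0]
old_output = next(n for n in graph.nodes if n.op == "output")
graph.erase_node(old_output)
graph.output({"hidden": hidden_node})

# Step 3: drop everything that no longer contributes to the output (fc2).
graph.eliminate_dead_code()

# Step 4: regenerate the Python forward() from the edited graph.
graph.lint()
traced.recompile()

with torch.no_grad():
    out = traced(torch.randn(3, 4))
remaining = [n.name for n in graph.nodes]
print(out["hidden"].shape)
print(remaining)
```

The recompiled module now returns a dictionary keyed by the name chosen in step 2, and the fc2 call no longer appears in the graph.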

About node names

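To specify which nodes should be the output nodes for extracted features, you should be familiar with the node naming convention used here (which differs slightly from the one used in torch.fx). A node name is specified as a .-separated path that walks the module hierarchy from the top-level module down to a leaf operation or leaf module. For example, "layer4.2.relu" in ResNet-50 is the output of a ReLU in the block at index 2 of the ResNet module's layer4. Here are some finer points to keep in mind: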

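When specifying node names for create_feature_extractor(), you may provide a truncated version of a node name as a shortcut. To see how this works, try creating a ResNet-50 model and printing the node names with train_nodes, _ = get_graph_node_names(model); print(train_nodes), and observe that the last node pertaining to layer4 is "layer4.2.relu_2". You can specify "layer4.2.relu_2" as the return node or, by convention, just "layer4", since this refers to the last node (in order of execution) of layer4.
If a certain module or operation is repeated more than once, node names get an additional _{int} postfix to disambiguate. For instance, suppose the addition (+) operation is used three times in the same forward method. Then there would be "path.to.module.add", "path.to.module.add_1", and "path.to.module.add_2". The counter is maintained within the scope of the direct parent. So in ResNet-50 there is a "layer4.1.add" and a "layer4.2.add": because the addition operations reside in different blocks, no postfix is needed to disambiguate.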
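The two conventions above can be modeled in a few lines of plain Python. This is only a toy model of the naming rules, not torchvision's code: name_nodes and resolve are hypothetical helpers, and the op sequence is a simplified assumption about two bottleneck blocks (print the result of get_graph_node_names(model) for the real list).

```python
from collections import defaultdict


def name_nodes(ops):
    """Name (parent_path, op) pairs in execution order.

    Hypothetical helper: a repeated op under the same direct parent gets
    an _{int} suffix, starting at _1 for the second occurrence.
    """
    seen = defaultdict(int)
    names = []
    for parent, op in ops:
        base = f"{parent}.{op}"
        count = seen[base]
        names.append(base if count == 0 else f"{base}_{count}")
        seen[base] += 1
    return names


def resolve(query, node_names):
    """Return the last node, in execution order, at or under `query`."""
    matches = [
        n for n in node_names if n == query or n.startswith(query + ".")
    ]
    if not matches:
        raise ValueError(f"no node matches {query!r}")
    return matches[-1]


# Assumed, simplified op sequence for one bottleneck block.
block_ops = ["conv1", "bn1", "relu", "conv2", "bn2", "relu",
             "conv3", "bn3", "add", "relu"]
ops = [(f"layer4.{i}", op) for i in (1, 2) for op in block_ops]
names = name_nodes(ops)

print(resolve("layer4", names))         # layer4.2.relu_2
print(resolve("layer4.1", names))       # layer4.1.relu_2
print(resolve("layer4.2.relu", names))  # layer4.2.relu
print("layer4.1.add" in names, "layer4.2.add" in names)  # True True
```

Because the counter is kept per direct parent, each block has its own unsuffixed "add", while the three ReLU calls inside one block become relu, relu_1, and relu_2.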

Example

Here is an example of how we might extract features for MaskRCNN:

import torch
from torchvision.models import resnet50
from torchvision.models.feature_extraction import get_graph_node_names
from torchvision.models.feature_extraction import create_feature_extractor
from torchvision.models.detection.mask_rcnn import MaskRCNN
from torchvision.models.detection.backbone_utils import LastLevelMaxPool
from torchvision.ops.feature_pyramid_network import FeaturePyramidNetwork


# To assist you in designing the feature extractor you may want to print out
# the available nodes for resnet50.
m = resnet50()
train_nodes, eval_nodes = get_graph_node_names(m)

# The returned lists are the names of all the graph nodes (in order of
# execution) for the input model traced in train mode and in eval mode
# respectively. You'll find that `train_nodes` and `eval_nodes` are the same
# for this example. But if the model contains control flow that's dependent
# on the training mode, they may be different.

# To specify the nodes you want to extract, you could select the final node
# that appears in each of the main layers:
return_nodes = {
    # node_name: user-specified key for output dict
    'layer1.2.relu_2': 'layer1',
    'layer2.3.relu_2': 'layer2',
    'layer3.5.relu_2': 'layer3',
    'layer4.2.relu_2': 'layer4',
}

# But `create_feature_extractor` can also accept truncated node specifications
# like "layer1", as it will just pick the last node that's a descendant
# of the specification. (Tip: be careful with this, especially when a layer
# has multiple outputs. It's not always guaranteed that the last operation
# performed is the one that corresponds to the output you desire. You should
# consult the source code for the input model to confirm.)
return_nodes = {
    'layer1': 'layer1',
    'layer2': 'layer2',
    'layer3': 'layer3',
    'layer4': 'layer4',
}

# Now you can build the feature extractor. This returns a module whose forward
# method returns a dictionary like:
# {
#     'layer1': output of layer 1,
#     'layer2': output of layer 2,
#     'layer3': output of layer 3,
#     'layer4': output of layer 4,
# }
feature_extractor = create_feature_extractor(m, return_nodes=return_nodes)

# Let's put all that together to wrap resnet50 with MaskRCNN

# MaskRCNN requires a backbone with an attached FPN
class Resnet50WithFPN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Get a resnet50 backbone
        m = resnet50()
        # Extract 4 main layers (note: MaskRCNN needs this particular name
        # mapping for return nodes)
        self.body = create_feature_extractor(
            m, return_nodes={f'layer{k}': str(v)
                             for v, k in enumerate([1, 2, 3, 4])})
        # Dry run to get number of channels for FPN
        inp = torch.randn(2, 3, 224, 224)
        with torch.no_grad():
            out = self.body(inp)
        in_channels_list = [o.shape[1] for o in out.values()]
        # Build FPN
        self.out_channels = 256
        self.fpn = FeaturePyramidNetwork(
            in_channels_list, out_channels=self.out_channels,
            extra_blocks=LastLevelMaxPool())

    def forward(self, x):
        x = self.body(x)
        x = self.fpn(x)
        return x


# Now we can build our model!
model = MaskRCNN(Resnet50WithFPN(), num_classes=91).eval()

API Reference

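`create_feature_extractor`(model[, ...])	Creates a new graph module that returns intermediate nodes from a given model as a dictionary, with user-specified keys as strings and the requested outputs as values.
`get_graph_node_names`(model[, tracer_kwargs, ...])	Dev utility to return node names in order of execution.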
