MultiAgentConvNet
- class torchrl.modules.MultiAgentConvNet(n_agents: int, centralized: Optional[bool] = None, share_params: Optional[bool] = None, *, in_features: Optional[int] = None, device: Optional[Union[torch.device, str, int]] = None, num_cells: Optional[Sequence[int]] = None, kernel_sizes: Union[Sequence[Union[int, Sequence[int]]], int] = 5, strides: Union[Sequence, int] = 2, paddings: Union[Sequence, int] = 0, activation_class: Type[torch.nn.Module] = torch.nn.ELU, use_td_params: bool = True, **kwargs)
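Multi-agent CNN.
In MARL settings, agents may or may not share the same policy for their actions: we say that the parameters can be shared or not. Similarly, a network may take the entire observation space (across agents) as input, or compute its output on a per-agent basis; we refer to these as "centralized" and "non-centralized", respectively.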
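The difference between the two input modes can be illustrated with plain tensor operations. The sketch below does not call MultiAgentConvNet; it assumes an observation of shape (*B, n_agents, channels, x, y) and assumes that a centralized network folds the agent dimension into the channel dimension, while a non-centralized network sees one agent's slice at a time. The variable names and sizes are illustrative only.

```python
import torch

# Illustrative sizes (assumptions, not library defaults).
batch, n_agents, channels, x, y = (3, 2), 7, 3, 8, 8
obs = torch.randn(*batch, n_agents, channels, x, y)

# Non-centralized: each agent's network only sees that agent's observation.
agent_0_obs = obs[..., 0, :, :, :]      # (*B, channels, x, y)

# Centralized (assumed layout): all agents are stacked along the channel
# axis, so a single network sees every agent's observation at once.
central_obs = obs.flatten(-4, -3)       # (*B, n_agents * channels, x, y)

print(agent_0_obs.shape)
print(central_obs.shape)
```

Under this assumption, the first convolution of a centralized network needs n_agents * channels input channels, whereas a non-centralized one needs only channels.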
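It expects inputs of shape (*B, n_agents, channels, x, y).

Note
To initialize the MARL module parameters with the torch.nn.init module, refer to the get_stateful_net() and from_stateful_net() methods.

- Parameters:
n_agents (int) – number of agents.
centralized (bool) – if True, each agent uses the inputs of all agents to compute its output. Otherwise, each agent uses only its own data as input.
share_params (bool) – if True, the same ConvNet is used for the forward pass of all agents. Otherwise, each agent uses its own ConvNet.
- Keyword Arguments: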
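in_features (int, optional) – the input feature dimension. If left as None, a lazy module is used.
device (str or torch.device, optional) – the device on which the module is created.
num_cells (int or Sequence[int], optional) – number of cells of each layer between the input and the output. If an integer is provided, every layer has the same number of cells. If an iterable is provided, the number of output channels of each convolutional layer matches the content of num_cells.
kernel_sizes (int, Sequence[Union[int, Sequence[int]]]) – kernel size(s) of the convolutional network. Defaults to 5.
strides (int or Sequence[int]) – stride(s) of the convolutional network. If iterable, its length must match the depth, which is defined by the num_cells or depth arguments. Defaults to 2.
activation_class (Type[nn.Module]) – activation class to use. Defaults to torch.nn.ELU.
use_td_params (bool, optional) – if True, the parameters can be found in self.params, which is a TensorDictParams object (it inherits from both TensorDict and nn.Module). If False, the parameters are contained in self._empty_net. The two approaches should be roughly equivalent but are not interchangeable: for instance, a state_dict created with use_td_params=True cannot be used when use_td_params=False.
**kwargs – passed to ConvNet; can be used to customize the ConvNet.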
Examples
>>> import torch
>>> from torchrl.modules import MultiAgentConvNet
>>> batch = (3, 2)
>>> n_agents = 7
>>> channels, x, y = 3, 100, 100
>>> obs = torch.randn(*batch, n_agents, channels, x, y)
>>> # Let's consider a centralized network with shared parameters.
>>> cnn = MultiAgentConvNet(
...     n_agents,
...     centralized=True,
...     share_params=True
... )
>>> print(cnn)
MultiAgentConvNet(
  (agent_networks): ModuleList(
    (0): ConvNet(
      (0): LazyConv2d(0, 32, kernel_size=(5, 5), stride=(2, 2))
      (1): ELU(alpha=1.0)
      (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
      (3): ELU(alpha=1.0)
      (4): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
      (5): ELU(alpha=1.0)
      (6): SquashDims()
    )
  )
)
>>> result = cnn(obs)
>>> # The final dimension of the resulting tensor is determined by the layer definition arguments and the shape of the input 'obs'.
>>> print(result.shape)
torch.Size([3, 2, 7, 2592])
>>> # Since both observations and parameters are shared, we expect all agents to have identical outputs (e.g. for a value function)
>>> print(all(result[0,0,0] == result[0,0,1]))
True
>>> # Alternatively, a local network with parameter sharing (e.g. decentralized weight sharing policy)
>>> cnn = MultiAgentConvNet(
...     n_agents,
...     centralized=False,
...     share_params=True
... )
>>> print(cnn)
MultiAgentConvNet(
  (agent_networks): ModuleList(
    (0): ConvNet(
      (0): Conv2d(4, 32, kernel_size=(5, 5), stride=(2, 2))
      (1): ELU(alpha=1.0)
      (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
      (3): ELU(alpha=1.0)
      (4): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
      (5): ELU(alpha=1.0)
      (6): SquashDims()
    )
  )
)
>>> result = cnn(obs)
>>> print(result.shape)
torch.Size([3, 2, 7, 2592])
>>> # Parameters are shared but not observations, hence each agent has a different output.
>>> print(all(result[0,0,0] == result[0,0,1]))
False

>>> # Or multiple local networks identical in structure but with differing weights.
>>> cnn = MultiAgentConvNet(
...     n_agents,
...     centralized=False,
...     share_params=False
... )
>>> print(cnn)
MultiAgentConvNet(
  (agent_networks): ModuleList(
    (0-6): 7 x ConvNet(
      (0): Conv2d(4, 32, kernel_size=(5, 5), stride=(2, 2))
      (1): ELU(alpha=1.0)
      (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
      (3): ELU(alpha=1.0)
      (4): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
      (5): ELU(alpha=1.0)
      (6): SquashDims()
    )
  )
)
>>> result = cnn(obs)
>>> print(result.shape)
torch.Size([3, 2, 7, 2592])
>>> print(all(result[0,0,0] == result[0,0,1]))
False

>>> # Or where inputs are shared but not parameters.
>>> cnn = MultiAgentConvNet(
...     n_agents,
...     centralized=True,
...     share_params=False
... )
>>> print(cnn)
MultiAgentConvNet(
  (agent_networks): ModuleList(
    (0-6): 7 x ConvNet(
      (0): Conv2d(28, 32, kernel_size=(5, 5), stride=(2, 2))
      (1): ELU(alpha=1.0)
      (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
      (3): ELU(alpha=1.0)
      (4): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
      (5): ELU(alpha=1.0)
      (6): SquashDims()
    )
  )
)
>>> result = cnn(obs)
>>> print(result.shape)
torch.Size([3, 2, 7, 2592])
>>> print(all(result[0,0,0] == result[0,0,1]))
False
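The last dimension of 2592 in the outputs above can be checked by hand. The sketch below is not part of the library: conv_out_size and flattened_features are hypothetical helpers that apply the standard convolution output-size formula (dilation 1) and assume three layers of 32 output channels, as the printed networks suggest, followed by a flattening of the channel and spatial dimensions.

```python
def conv_out_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Spatial output size of one convolution (dilation assumed to be 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(x, y, num_cells, kernel_size=5, stride=2, padding=0):
    """Feature count after stacking len(num_cells) convolutions and flattening."""
    for _ in num_cells:
        x = conv_out_size(x, kernel_size, stride, padding)
        y = conv_out_size(y, kernel_size, stride, padding)
    # The last layer's channel count times the remaining spatial grid.
    return num_cells[-1] * x * y


# 100 -> 48 -> 22 -> 9 per spatial axis, so 32 * 9 * 9 features.
print(flattened_features(100, 100, [32, 32, 32]))  # 2592
```

The result does not depend on the number of input channels, which is why the centralized and non-centralized examples report the same output shape.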