Layer size mismatch going from CNN layers to a Linear layer


I am trying to dynamically generate models with varying numbers of CNN layers to test their accuracy on the Fashion-MNIST dataset. My create_cnn_model function works for layer_count values from 1 to 9, but it fails at 10. I am doing something wrong when computing the input size of the final linear layer.

For context, create_cnn_model adds a MaxPool2d layer after every 3 convolutional layers. After building the n CNN layers, it appends a Flatten and a Linear layer.

There are two places I am unsure about. The first is how I use the factor variable inside create_cnn_model; I hope I understand it and am applying it correctly. The other is the final linear layer: honestly, I am not sure my size calculation is right.

import torch
import torch.nn as nn

# What's the width and height of our images?
W, H = 28, 28
# How many values are in the input? We use this to help determine the size of subsequent layers
D = 28 * 28  # 28x28 images
# Hidden layer size
n = 256
# How many channels are in the input?
C = 1
# How many filters per convolutional layer
n_filters = 32
# How many classes are there?
classes = 10

leak_rate = 0.01

loss_func = nn.CrossEntropyLoss()

# function altered to support optional batch_norm
def cnnLayer(in_filters, out_filters=None, kernel_size=3, batch_norm=False):
    """
    in_filters: how many channels are coming into the layer
    out_filters: how many channels this layer should learn / output, or `None` if we want to have the same number of channels as the input.
    kernel_size: how large the kernel should be
    batch_norm: defines if a batch norm layer should be included
    """
    if out_filters is None:
        out_filters = in_filters  # This is a common pattern, so let's make it the default
    padding = kernel_size // 2  # padding so the spatial size stays the same
    layers = []
    layers.append(nn.Conv2d(in_filters, out_filters, kernel_size, padding=padding))
    if batch_norm:
        layers.append(nn.BatchNorm2d(out_filters))
    layers.append(nn.LeakyReLU(leak_rate))
    return nn.Sequential( # Combine the layer and activation into a single unit
        *layers
    )
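
# Quick sanity check of cnnLayer (illustrative only): with padding set to
# kernel_size // 2, the spatial size is preserved and only the channel
# count changes.
_block = cnnLayer(1, 32)
assert _block(torch.zeros(1, 1, 28, 28)).shape == (1, 32, 28, 28)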


def create_cnn_model(layer_count: int, include_bn_layer: bool):
    layers = []
    factor = 1
    for layer_index in range(layer_count):
        if layer_index == 0:
            layers.append(cnnLayer(C, n_filters, batch_norm=include_bn_layer))
        elif layer_index % 3 == 0:  # layer_index != 0 is already guaranteed here
            layers.append(nn.MaxPool2d((2, 2)))
            layers.append(cnnLayer(factor * n_filters, 2 * factor * n_filters, batch_norm=include_bn_layer))
            factor = factor * 2
        else:
            layers.append(cnnLayer(factor * n_filters, batch_norm=include_bn_layer))
    layers.append(nn.Flatten())
    # This is the calculation I am unsure about:
    layers.append(nn.Linear((D * n_filters) // factor, classes))
    return nn.Sequential(*layers)
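
My reasoning for the Linear input size: each MaxPool2d quarters the H x W area, while the convolution right after it doubles the channel count, so every pooling stage should halve the flattened feature count. That is where (D * n_filters) // factor comes from, with factor doubling at each pool.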

Here is the model's string representation when I run create_cnn_model(10, False):

Sequential(
  (0): Sequential(
    (0): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): LeakyReLU(negative_slope=0.01)
  )
  (1): Sequential(
    (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): LeakyReLU(negative_slope=0.01)
  )
  (2): Sequential(
    (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): LeakyReLU(negative_slope=0.01)
  )
  (3): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
  (4): Sequential(
    (0): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): LeakyReLU(negative_slope=0.01)
  )
  (5): Sequential(
    (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): LeakyReLU(negative_slope=0.01)
  )
  (6): Sequential(
    (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): LeakyReLU(negative_slope=0.01)
  )
  (7): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
  (8): Sequential(
    (0): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): LeakyReLU(negative_slope=0.01)
  )
  (9): Sequential(
    (0): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): LeakyReLU(negative_slope=0.01)
  )
  (10): Sequential(
    (0): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): LeakyReLU(negative_slope=0.01)
  )
  (11): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
  (12): Sequential(
    (0): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): LeakyReLU(negative_slope=0.01)
  )
  (13): Flatten(start_dim=1, end_dim=-1)
  (14): Linear(in_features=3136, out_features=10, bias=True)
)

The error:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (128x2304 and 3136x10)
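
The 128 here is just my batch size. What puzzles me is that the flattened input apparently has 2304 features, while my formula gives (D * n_filters) // factor = (784 * 32) // 8 = 3136, and I cannot see where the calculation drifts.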

Let me know if I have left out any information; I have spent nearly 4 hours on this.

Tags: python, math, deep-learning, pytorch, neural-network
1 Answer

Hard to say exactly where your formula goes wrong, but you can skip the hand calculation entirely: run a dummy tensor through the layers up to just before the Flatten/Linear pair and read the required size off the output tensor's shape:

def create_cnn_model(layer_count: int, include_bn_layer: bool):
    layers = []
    factor = 1
    for layer_index in range(layer_count):
        if layer_index == 0:
            layers.append(cnnLayer(C, n_filters, batch_norm=include_bn_layer))
        elif layer_index % 3 == 0:
            layers.append(nn.MaxPool2d((2, 2)))
            layers.append(cnnLayer(factor * n_filters, 2 * factor * n_filters, batch_norm=include_bn_layer))
            factor = factor * 2
        else:
            layers.append(cnnLayer(factor * n_filters, batch_norm=include_bn_layer))
    # Run a dummy input through the convolutional stack once and read the
    # flattened size off the resulting shape instead of deriving it by formula.
    with torch.no_grad():
        dummy = nn.Sequential(*layers)(torch.zeros(1, C, H, W))
    out_shape = dummy.shape[1:]  # exclude the batch dimension
    layers.append(nn.Flatten())
    linear_in_features = out_shape[0] * out_shape[1] * out_shape[2]
    layers.append(nn.Linear(linear_in_features, classes))
    return nn.Sequential(*layers)
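
For what it's worth, tracing the printed model by hand shows why the original formula breaks at layer_count = 10: the three MaxPool2d layers take the 28x28 input to 14x14, then 7x7, then 3x3 (7 // 2 = 3, so a row and a column are floored away), giving 256 * 3 * 3 = 2304 flattened features. The formula (D * n_filters) // factor assumes each pooling stage exactly halves the feature count, which only holds while the spatial size divides evenly; it predicts (784 * 32) // 8 = 3136, hence the mismatch in the error message.

If you do not want any manual bookkeeping at all, here is a sketch of an alternative, assuming your PyTorch version is at least 1.8 (which provides nn.LazyLinear): the framework infers in_features on the first forward pass.

import torch
import torch.nn as nn

# Illustrative tail of the model: nn.LazyLinear fills in in_features the
# first time data flows through it, so no dummy tensor or formula is needed.
tail = nn.Sequential(nn.Flatten(), nn.LazyLinear(10))
out = tail(torch.zeros(1, 256, 3, 3))  # the 3x3x256 feature map from the trace above
print(out.shape)  # torch.Size([1, 10]); in_features was inferred as 2304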