Implementation problem: Deep ConvNet for pattern recognition


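I am trying to implement a pattern recognition model using a fully convolutional network (Figure 1 in https://www.sciencedirect.com/science/article/pii/S0031320318304370; I was able to get the full text without logging in, but I can attach the figure if that is a problem). However, I get a size error when moving from the final Conv2D layer to the first fc_layer.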

This is my error message:

RuntimeError: size mismatch, m1: [4 x 1024], m2: [4 x 1024] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:283

Originally, as shown in the figure, my first linear layer was:

nn.Linear(4*4*512, 1024)

But after getting a size mismatch, I changed it to:

nn.Linear(4,1024)

Now I get the strange error message shown above.

For reference (in case it helps), here is my code:


import torch  # needed for torch.flatten in forward()
import torch.nn as nn
import torch.utils.model_zoo as model_zoo

class convnet(nn.Module):

    def __init__(self, num_classes=1000):
        super(convnet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.MaxPool2d(kernel_size=1),
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),# stride=2),
            nn.Conv2d(128, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2), #stride=2),
            nn.Conv2d(256, 512, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, kernel_size=3, padding=1),
            nn.ReLU(inplace=True), #nn.Dropout(p=0.5)
        )

        self.classifier = nn.Sequential(
            nn.Linear(4, 1024),
            nn.Dropout(p=0.5),
            nn.ReLU(inplace=True),
            #nn.Dropout(p=0.5),
            nn.Linear(1024, 1024),
            nn.ReLU(inplace=True),
            nn.Linear(1024, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x,1)
        x = self.classifier(x)
        return x

I suspect this is a padding and stride issue. Thanks!

deep-learning pytorch conv-neural-network pattern-recognition
1 Answer

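The error comes from a matrix multiplication, where m1 should be an m x n matrix and m2 an n x p matrix, so that the result is an m x p matrix. In your case the shapes are 4 x 1024 and 4 x 1024, which does not work because 1024 != 4.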

This means the input to your first linear layer has size [4, 1024] (4 is the batch size), so the first linear layer should have 1024 input features:

self.classifier = nn.Sequential(
    nn.Linear(1024, 1024),
    nn.Dropout(p=0.5),
    nn.ReLU(inplace=True),
    #nn.Dropout(p=0.5),
    nn.Linear(1024, 1024),
    nn.ReLU(inplace=True),
    nn.Linear(1024, num_classes),
)

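As a quick check of the shape rule, here is a minimal sketch that feeds a random tensor shaped like m1 into both versions of the first linear layer. The tensor `batch` and the names `bad`, `good`, and `mismatch_raised` are illustrative and do not come from the question.

```python
import torch
import torch.nn as nn

# Random stand-in for the flattened conv output: batch size 4, 1024 features.
batch = torch.randn(4, 1024)

# nn.Linear(4, 1024) stores a weight of shape [1024, 4]; its transpose is
# [4, 1024], so the product [4 x 1024] @ [4 x 1024] is not defined.
bad = nn.Linear(4, 1024)
try:
    bad(batch)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

# With in_features=1024 the inner dimensions agree: [4 x 1024] @ [1024 x 1024].
good = nn.Linear(1024, 1024)
out = good(batch)
print(mismatch_raised, tuple(out.shape))
```

The first argument of nn.Linear has to match the last dimension of the input, not the batch size.
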
If you are not sure how many features the input has, you can print its size just before the layer:

x = self.features(x)
x = torch.flatten(x, 1)
print(x.size())  # => torch.Size([4, 1024])
x = self.classifier(x)
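
The flattened size can also be worked out by hand with the standard output-size formula floor((n + 2*padding - kernel) / stride) + 1, which applies to both Conv2d and MaxPool2d at dilation 1. Below is a rough sketch for the layers in the question's `features` block. The helper names `out_size` and `flattened_features` and the two input sizes are hypothetical, since the question does not state the actual input size.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a Conv2d/MaxPool2d layer (dilation 1)."""
    return (n + 2 * padding - kernel) // stride + 1

# (kernel, stride, padding) for every layer in `features` that can change
# the spatial size. MaxPool2d's stride defaults to its kernel_size.
LAYERS = [
    (3, 2, 1),  # Conv2d(1, 64, stride=2)
    (3, 1, 1),  # Conv2d(64, 64)
    (3, 1, 1),  # Conv2d(64, 64)
    (1, 1, 0),  # MaxPool2d(kernel_size=1)
    (3, 1, 1),  # Conv2d(64, 128)
    (3, 1, 1),  # Conv2d(128, 128)
    (2, 2, 0),  # MaxPool2d(kernel_size=2)
    (3, 1, 1),  # Conv2d(128, 256)
    (3, 1, 1),  # Conv2d(256, 256)
    (2, 2, 0),  # MaxPool2d(kernel_size=2)
    (3, 1, 1),  # Conv2d(256, 512)
    (3, 1, 1),  # Conv2d(512, 512)
]

def flattened_features(height, width, channels=512):
    """Number of features per sample after torch.flatten(x, 1)."""
    for kernel, stride, padding in LAYERS:
        height = out_size(height, kernel, stride, padding)
        width = out_size(width, kernel, stride, padding)
    return channels * height * width

# Hypothetical input sizes, for illustration only.
print(flattened_features(32, 32))
print(flattened_features(8, 16))
```

Under these assumptions, a 32 x 32 input would give 4*4*512 = 8192 features, which matches the original nn.Linear(4*4*512, 1024), while an 8 x 16 input would give 1024. Note that MaxPool2d(kernel_size=1) leaves the spatial size unchanged, because its stride defaults to the kernel size.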
