Input 0 is incompatible with layer repeat_vector_40: expected ndim=2, found ndim=1

Question

I am developing an LSTM autoencoder model for anomaly detection. My keras model is set up as follows:

from keras.models import Sequential

from keras import Model, layers
from keras.layers import Layer, Conv1D, Input, Masking, Dense, RNN, LSTM, Dropout, RepeatVector, TimeDistributed, Reshape

def create_RNN_with_attention():
    x=Input(shape=(X_train_dt.shape[1], X_train_dt.shape[2]))
    RNN_layer_1 = LSTM(units=64, return_sequences=False)(x)
    attention_layer = attention()(RNN_layer_1)
    dropout_layer_1 = Dropout(rate=0.2)(attention_layer)
    repeat_vector_layer = RepeatVector(n=X_train_dt.shape[1])(dropout_layer_1)
    RNN_layer_2 = LSTM(units=64, return_sequences=True)(repeat_vector_layer)
    dropout_layer_2 = Dropout(rate=0.2)(RNN_layer_2)
    output = TimeDistributed(Dense(X_train_dt.shape[2], trainable=True))(dropout_layer_2)
    model=Model(x,output)
    model.compile(loss='mae', optimizer='adam')    
    return model

Note the attention layer I added, attention_layer. Before adding it, the model compiled fine, but after adding attention_layer the model throws the following error:

ValueError: Input 0 is incompatible with layer repeat_vector_40: expected ndim=2, found ndim=1

My attention layer is set up as follows:

import keras.backend as K
class attention(Layer):
    def __init__(self,**kwargs):
        super(attention,self).__init__(**kwargs)
 
    def build(self,input_shape):
        self.W=self.add_weight(name='attention_weight', shape=(input_shape[-1],1), 
                               initializer='random_normal', trainable=True)
        self.b=self.add_weight(name='attention_bias', shape=(input_shape[1],1), 
                               initializer='zeros', trainable=True)        
        super(attention, self).build(input_shape)
 
    def call(self,x):
        # Alignment scores. Pass them through tanh function
        e = K.tanh(K.dot(x,self.W)+self.b)
        # Remove dimension of size 1
        e = K.squeeze(e, axis=-1)   
        # Compute the weights
        alpha = K.softmax(e)
        # Reshape to tensorFlow format
        alpha = K.expand_dims(alpha, axis=-1)
        # Compute the context vector
        context = x * alpha
        context = K.sum(context, axis=1)
        return context

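The idea of the attention mask is to let the model focus on the more prominent features as it trains.

Why am I getting the above error, and how do I fix it?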

Answer 1

I think the problem is in this line:

RNN_layer_1 = LSTM(units=64, return_sequences=False)(x)

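This layer outputs a tensor of shape (batch_size, 64). That means you output a single vector per sample and then run the attention mechanism with respect to the batch dimension instead of the sequence dimension. It also means the attention output has its batch dimension squeezed away, which no keras layer accepts. This is why the RepeatVector layer raises the error: it expects a 2-D input of shape (batch_dimension, dim).

If you want to run the attention mechanism over the sequence, change the line above to: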

RNN_layer_1 = LSTM(units=64, return_sequences=True)(x)
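To make the shape argument concrete, here is a rough NumPy sketch that mirrors the arithmetic in the question's attention.call. It is not Keras code: attention_call, the sizes (10 time steps, batch of 4) and the random weights are assumptions made up for illustration. The 2-D case uses a batch of 1 so that NumPy broadcasting lets through what Keras accepts when the batch size is unknown.

```python
import numpy as np

def attention_call(x, W, b):
    """NumPy mirror of the question's attention.call (illustrative only)."""
    e = np.tanh(x @ W + b)                       # alignment scores
    e = np.squeeze(e, axis=-1)                   # drop the trailing size-1 axis
    z = np.exp(e - e.max(axis=-1, keepdims=True))
    alpha = z / z.sum(axis=-1, keepdims=True)    # softmax over the last axis
    alpha = np.expand_dims(alpha, axis=-1)
    return np.sum(x * alpha, axis=1)             # weighted sum over axis 1

rng = np.random.default_rng(0)
timesteps, units = 10, 64                        # assumed sizes
W = rng.normal(size=(units, 1))                  # shape (input_shape[-1], 1)

# return_sequences=True: x is (batch, timesteps, units), b is (timesteps, 1)
x_seq = rng.normal(size=(4, timesteps, units))
ctx_seq = attention_call(x_seq, W, np.zeros((timesteps, 1)))
print(ctx_seq.shape)                             # (4, 64)

# return_sequences=False: x is (batch, units), so input_shape[1] is units
# and b becomes (units, 1); broadcasting then swallows the batch axis.
x_vec = rng.normal(size=(1, units))
ctx_vec = attention_call(x_vec, W, np.zeros((units, 1)))
print(ctx_vec.shape)                             # (64,)
```

With the sequence input the context vector keeps its batch axis, (4, 64), which is the 2-D shape RepeatVector wants. With the single-vector input the result is 1-D, matching the "found ndim=1" in the error message.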

Answer 2

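In attention models, the RepeatVector layer is usually not used. That layer repeats its input vector as many times as there are output time steps. But when an attention mechanism is used, there is no need to repeat the output vector, because the importance weights are applied across all time steps.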
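For reference, what RepeatVector does to shapes can be sketched in NumPy as below. repeat_vector is a hypothetical stand-in written for this illustration, not the Keras implementation; the ndim check only imitates the 2-D input requirement reported in the error message.

```python
import numpy as np

def repeat_vector(x, n):
    """NumPy stand-in for RepeatVector(n): (batch, features) -> (batch, n, features)."""
    if x.ndim != 2:
        # Imitates the layer's input check; the real message comes from Keras.
        raise ValueError(f"expected ndim=2, found ndim={x.ndim}")
    # Insert a time axis, then copy the vector n times along it.
    return np.repeat(x[:, np.newaxis, :], n, axis=1)

context = np.arange(6, dtype=float).reshape(2, 3)   # 2 samples, 3 features
repeated = repeat_vector(context, n=4)
print(repeated.shape)                                # (2, 4, 3)
```

Every time step of repeated holds the same per-sample vector, which is what the decoder LSTM in the question receives as its input sequence.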

More specifically, in your model the output of the LSTM layer RNN_layer_1 is taken first (with return_sequences=True). Then, by applying the attention mechanism (through the attention layer, with RepeatVector to repeat vectors), the importance of each time step is determined. Finally, the TimeDistributed Dense layer computes the output for each time step.

So the RepeatVector layer is not needed here and should be removed.
