使用序列数据进行多类分类,LSTM Keras 不起作用

问题描述 投票:0回答:1

我正在尝试对顺序数据进行多类分类,以根据源的累积读取来了解某些事件的来源。

我使用具有 64 个单元的简单 LSTM 层和具有与目标相同数量的单元的密集层。该模型似乎没有学到任何东西,因为准确度仍然约为 1%。 def create_model(): 模型 = 顺序()

model.add(LSTM(64, return_sequences=False))

model.add(Dense(8))
model.add(Activation("softmax"))

model.compile(
    loss="categorical_crossentropy",
    optimizer=Adam(lr=0.00001),
    metrics=["accuracy"],
)

return model

我尝试将学习率更改为非常小的值(0.001、0.0001、1e-5)并进行更大时期的训练,但没有观察到准确性的变化。我在这里错过了什么吗?是我的数据预处理不正确还是模型创建有问题?

预先感谢您的帮助。

数据集


Accumulated- Source-1   Source-2    Source-3  
Reading   
217             0       0       0  
205             0       0       0  
206             0       0       0  
231             0       0       0  
308             0       0       1  
1548            0       0       1  
1547            0       0       1  
1530            0       0       1  
1545            0       0       1  
1544            0       0       1   
1527            0       0       1  
1533            0       0       1  
1527            0       0       1  
1527            0       0       1  
1534            0       0       1  
1520            0       0       1  
1524            0       0       1  
1523            0       0       1  
205             0       0       0  
209             0       0       0  
.  
.  
.  

我创建了一个 SEQ_LEN=5 的滚动窗口数据集,将其馈送到 LSTM 网络:


rolling_window                   labels
[205, 206, 217, 205, 206]       [0, 0, 0]
[206, 217, 205, 206, 231]       [0, 0, 0]
[217, 205, 206, 231, 308]       [0, 0, 1]
[205, 206, 231, 308, 1548]      [0, 0, 1]
[206, 231, 308, 1548, 1547]     [0, 0, 1]
[231, 308, 1548, 1547, 1530]    [0, 0, 1]
[308, 1548, 1547, 1530, 1545]   [0, 0, 1]
[1548, 1547, 1530, 1545, 1544]  [0, 0, 1]
[1547, 1530, 1545, 1544, 1527]  [0, 0, 1]
[1530, 1545, 1544, 1527, 1533]  [0, 0, 1]
[1545, 1544, 1527, 1533, 1527]  [0, 0, 1]
[1544, 1527, 1533, 1527, 1527]  [0, 0, 1]
[1527, 1533, 1527, 1527, 1534]  [0, 0, 1]
[1533, 1527, 1527, 1534, 1520]  [0, 0, 1]
[1527, 1527, 1534, 1520, 1524]  [0, 0, 1]
[1527, 1534, 1520, 1524, 1523]  [0, 0, 1]
[1534, 1520, 1524, 1523, 1520]  [0, 0, 1]
[1520, 1524, 1523, 1520, 205]   [0, 0, 0]
.  
.  
.

重塑数据集

X_train = train_df.rolling_window.values
X_train = X_train.reshape(X_train.shape[0], 1, SEQ_LEN)

Y_train = train_df.labels.values
Y_train = Y_train.reshape(Y_train.shape[0], 3)

型号

def create_model():
    model = Sequential()

    model.add(LSTM(64, input_shape=(1, SEQ_LEN), return_sequences=True))
    model.add(Activation("relu"))

    model.add(Flatten())
    model.add(Dense(3))
    model.add(Activation("softmax"))

    model.compile(
        loss="categorical_crossentropy", optimizer=Adam(lr=0.01), metrics=["accuracy"]
    )

    return model

培训

model = create_model()
model.fit(X_train, Y_train, batch_size=512, epochs=5)

训练输出

Epoch 1/5
878396/878396 [==============================] - 37s 42us/step - loss: 0.2586 - accuracy: 0.0173
Epoch 2/5
878396/878396 [==============================] - 36s 41us/step - loss: 0.2538 - accuracy: 0.0175
Epoch 3/5
878396/878396 [==============================] - 36s 41us/step - loss: 0.2538 - accuracy: 0.0176
Epoch 4/5
878396/878396 [==============================] - 37s 42us/step - loss: 0.2537 - accuracy: 0.0177
Epoch 5/5
878396/878396 [==============================] - 38s 43us/step - loss: 0.2995 - accuracy: 0.0174

[编辑-1]
尝试了 Max 的建议后,结果如下(损失和准确率仍然没有改变)

推荐型号

def create_model():
    model = Sequential()

    model.add(LSTM(64, return_sequences=False))

    model.add(Dense(8))
    model.add(Activation("softmax"))

    model.compile(
        loss="categorical_crossentropy",
        optimizer=Adam(lr=0.001),
        metrics=["accuracy"],
    )

    return model

X_火车


array([[[205],
        [217],
        [209],
        [215],
        [206]],

       [[217],
        [209],
        [215],
        [206],
        [206]],

       [[209],
        [215],
        [206],
        [206],
        [211]],

       ...,

       [[175],
        [175],
        [173],
        [176],
        [174]],

       [[175],
        [173],
        [176],
        [174],
        [176]],

       [[173],
        [176],
        [174],
        [176],
        [173]]])

Y_train(P.S:实际上有8个目标类。上面的例子是对实际问题的简化)


array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]])

训练输出

Epoch 1/5
878396/878396 [==============================] - 15s 17us/step - loss: 0.1329 - accuracy: 0.0190
Epoch 2/5
878396/878396 [==============================] - 15s 17us/step - loss: 0.1313 - accuracy: 0.0190
Epoch 3/5
878396/878396 [==============================] - 16s 18us/step - loss: 0.1293 - accuracy: 0.0190
Epoch 4/5
878396/878396 [==============================] - 16s 18us/step - loss: 0.1355 - accuracy: 0.0195
Epoch 5/5
878396/878396 [==============================] - 15s 18us/step - loss: 0.1315 - accuracy: 0.0236

[编辑-2]
根据 Max 和 Marcin 的以下建议,准确率大多保持在 3% 以下。虽然只有十分之一,但准确率高达 95%。这完全取决于第一个纪元开始时的准确性。如果它没有在正确的位置开始梯度下降,就无法达到良好的精度。我需要使用不同的初始化程序吗?改变学习率不会带来可重复的结果。

建议:
1. 缩放/标准化 X_train(完成)
2.不重塑Y_train(完成)
3. LSTM层的单元更少(从64个减少到16个)
4. 拥有更小的batch_size(从512减少到64)

缩放X_train

array([[[ 0.01060734],
        [ 0.03920736],
        [ 0.02014085],
        [ 0.03444091],
        [ 0.01299107]],

       [[ 0.03920728],
        [ 0.02014073],
        [ 0.03444082],
        [ 0.01299095],
        [ 0.01299107]],

       [[ 0.02014065],
        [ 0.0344407 ],
        [ 0.01299086],
        [ 0.01299095],
        [ 0.02490771]],

       ...,

       [[-0.06089251],
        [-0.06089243],
        [-0.06565897],
        [-0.05850889],
        [-0.06327543]],

       [[-0.06089251],
        [-0.06565908],
        [-0.05850898],
        [-0.06327555],
        [-0.05850878]],

       [[-0.06565916],
        [-0.0585091 ],
        [-0.06327564],
        [-0.05850889],
        [-0.06565876]]])

未重塑Y_train

array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]])

具有较少 LSTM 单元的模型

def create_model():
    model = Sequential()

    model.add(LSTM(16, return_sequences=False))

    model.add(Dense(8))
    model.add(Activation("softmax"))

    model.compile(
        loss="categorical_crossentropy", optimizer=Adam(lr=0.001), metrics=["accuracy"]
    )

    return model

训练输出

Epoch 1/5
878396/878396 [==============================] - 26s 30us/step - loss: 0.1325 - accuracy: 0.0190
Epoch 2/5
878396/878396 [==============================] - 26s 29us/step - loss: 0.1352 - accuracy: 0.0189
Epoch 3/5
878396/878396 [==============================] - 26s 30us/step - loss: 0.1353 - accuracy: 0.0192
Epoch 4/5
878396/878396 [==============================] - 26s 29us/step - loss: 0.1365 - accuracy: 0.0197
Epoch 5/5
878396/878396 [==============================] - 27s 31us/step - loss: 0.1378 - accuracy: 0.0201
machine-learning keras deep-learning lstm
1个回答
0
投票

序列应该是 LSTM 的第一个维度(输入数组的第二个维度),即:

重塑数据集

X_train = train_df.rolling_window.values
X_train = X_train.reshape(X_train.shape[0], SEQ_LEN, 1)

Y_train = train_df.labels.values
Y_train = Y_train.reshape(Y_train.shape[0], 3)

LSTM 不需要输入形状。 LSTM 默认具有“tanh”激活,这通常是一个不错的选择。

型号

def create_model():
    model = Sequential()

    model.add(LSTM(64, return_sequences=True))
    
    model.add(Flatten())
    model.add(Dense(3))
    model.add(Activation("softmax"))
    
    model.compile(loss="categorical_crossentropy", optimizer=Adam(lr=0.01), metrics=["accuracy"])

    return model

也许不使用 Flatten() 层而是对 LSTM 使用 return_sequences=False 会是更好的选择。试试吧。

还可以尝试对数据的特征缩放进行预处理。数据值看起来相当大。

© www.soinside.com 2019 - 2024. All rights reserved.