Keras 3 Attention layer does not accept the output of a preceding LSTM layer

Question

As an exercise to get familiar with Keras, I want to train a simple model with attention to translate sentences.

I get the following error:

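A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces keras.layers and keras.operations). [...]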

I am not explicitly calling any tf functions; I am only using Keras layers.

I want to learn how to use Keras 3 (because I am interested in its support for other backends) rather than go back to Keras 2.

Any help would be greatly appreciated!

Here is the model code, written with the Keras functional API:

encoder_inputs = tf.keras.layers.Input(shape=[], dtype=tf.string)
decoder_inputs = tf.keras.layers.Input(shape=[], dtype=tf.string)

embed_size = 128
encoder_inputs_ids = text_vec_layer_en(encoder_inputs)
decoder_inputs_ids = text_vec_layer_es(decoder_inputs)
encoder_embedding_layer = tf.keras.layers.Embedding(vocab_size, embed_size, mask_zero=True)
decoder_embedding_layer = tf.keras.layers.Embedding(vocab_size, embed_size, mask_zero=True)
encoder_embeddings = encoder_embedding_layer(encoder_inputs_ids)
decoder_embeddings = decoder_embedding_layer(decoder_inputs_ids)

encoder = tf.keras.layers.LSTM(512, return_sequences=True, return_state=True)
encoder_outputs, *encoder_state = encoder(encoder_embeddings)

decoder = tf.keras.layers.LSTM(512, return_sequences=True)
decoder_outputs = decoder(decoder_embeddings, initial_state=encoder_state)

# Attention layer here!
# Problems getting it to work on Keras 3
attention_layer = tf.keras.layers.Attention()
attention_outputs = attention_layer([decoder_outputs, encoder_outputs])

output_layer = tf.keras.layers.Dense(vocab_size, activation="softmax")
Y_probas = output_layer(attention_outputs)

Expected behavior: the Keras Attention layer accepts Keras tensors as input.

Answer 1

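The developers' suggestion on the GitHub issue I opened was to use MultiHeadAttention, which accepts the outputs of the previous layers directly and is better suited to translation.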

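In my case I was interested in getting familiar with the Keras interface, so this does not really answer my underlying question (why does the Attention layer not work? Is there some logic behind it, or is it a bug?). The workaround does work, though, and gives you a working model.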
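
For reference, a minimal sketch of the MultiHeadAttention workaround might look like the following. It makes several assumptions that are not in the question or the GitHub issue: the inputs are integer token IDs rather than raw strings passed through the text vectorization layers, padding masks (mask_zero=True) are left out to keep the example small, and the build_model helper, num_heads and key_dim values are purely illustrative.

```python
import numpy as np
import keras
from keras import layers


def build_model(vocab_size=1000, embed_size=128, units=512, num_heads=4):
    """Hypothetical helper: encoder-decoder LSTM with MultiHeadAttention."""
    # Assumption: inputs are already integer token IDs of variable length.
    encoder_input_ids = keras.Input(shape=(None,), dtype="int32")
    decoder_input_ids = keras.Input(shape=(None,), dtype="int32")

    encoder_embeddings = layers.Embedding(vocab_size, embed_size)(encoder_input_ids)
    decoder_embeddings = layers.Embedding(vocab_size, embed_size)(decoder_input_ids)

    # The encoder returns the full output sequence plus its final h and c states.
    encoder_outputs, state_h, state_c = layers.LSTM(
        units, return_sequences=True, return_state=True
    )(encoder_embeddings)

    # The decoder starts from the encoder's final states.
    decoder_outputs = layers.LSTM(units, return_sequences=True)(
        decoder_embeddings, initial_state=[state_h, state_c]
    )

    # Cross-attention: decoder outputs are the queries, encoder outputs the values
    # (and, by default, the keys). key_dim is an illustrative choice.
    attention_outputs = layers.MultiHeadAttention(
        num_heads=num_heads, key_dim=units // num_heads
    )(query=decoder_outputs, value=encoder_outputs)

    Y_probas = layers.Dense(vocab_size, activation="softmax")(attention_outputs)
    return keras.Model([encoder_input_ids, decoder_input_ids], Y_probas)


if __name__ == "__main__":
    model = build_model(vocab_size=50, embed_size=8, units=16)
    enc = np.random.randint(1, 50, size=(2, 7))
    dec = np.random.randint(1, 50, size=(2, 5))
    print(model.predict([enc, dec], verbose=0).shape)
```

Unlike Attention, which takes a single list such as [query, value], MultiHeadAttention takes query and value as separate arguments, so the call site changes slightly when swapping one layer for the other.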
