TensorFlow fit won't take a custom dataset and throws an error: ValueError: Cannot take the length of shape with unknown rank


I am trying to train a model that worked fine until recently. The fit function throws the following error:

    <ipython-input-20-01755a6ded38> in <cell line: 1>()
----> 1 model.fit(
      2     dataset,
      3     epochs=100,
      4     verbose=1,
      5     batch_size=8)

1 frames /usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py in error_handler(*args, **kwargs)
    120             # To get the full stack trace, call:
    121             # `keras.config.disable_traceback_filtering()`
--> 122             raise e.with_traceback(filtered_tb) from None
    123         finally:
    124             del filtered_tb

/usr/local/lib/python3.10/dist-packages/keras/src/losses/loss.py in squeeze_or_expand_to_same_rank(x1, x2, expand_rank_1)
    105 def squeeze_or_expand_to_same_rank(x1, x2, expand_rank_1=True):
    106     """Squeeze/expand last dim if ranks differ from expected by exactly 1."""
--> 107     x1_rank = len(x1.shape)
    108     x2_rank = len(x2.shape)
    109     if x1_rank == x2_rank:

ValueError: Cannot take the length of shape with unknown rank.

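I tried passing only the x and y produced by the generator directly into fit, and that worked fine, so this does not look like a shape problem. Here is a reproduction of the error. The model is just a simple Sequential model: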

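# Imports used by the snippets in this question
# (standalone keras import assumed, as the keras/src paths in the traceback suggest)
import numpy as np
import tensorflow as tf
import keras

model = keras.models.Sequential(
    [keras.layers.Dense(10, activation='relu'),
     keras.layers.Dense(1, activation='sigmoid')]
)
model.compile(loss='binary_crossentropy', metrics=['accuracy'])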

The dataset is built with tf.data.Dataset.from_generator(), as follows:

#Random dataset
a = tf.convert_to_tensor(np.random.randint(0,100, size=[10,10]))
    
#Data generator class
class DataGenerator:
   def __init__(self, data, ratio=3):
      self._ratio = ratio
      self._data = data
    
   def __call__(self):
      shape = tf.shape(self._data).numpy()
      x = tf.convert_to_tensor(np.random.randint(1000,100000, size=[shape[0] * self._ratio, shape[1]]))
      x = tf.concat([self._data, x], axis = 0)
      y = tf.convert_to_tensor(np.random.random(shape[0]*(1 + self._ratio)))
    
      yield x, y


data_gen = DataGenerator(a, 3)
dataset = tf.data.Dataset.from_generator(
            data_gen,
            output_signature=(
                tf.TensorSpec(shape=(None,10), dtype=tf.int32),
                tf.TensorSpec(shape=(None), dtype=tf.float32)))

Calling model.fit() produces the error shown above:

model.fit(
    dataset,
    epochs=100,
    verbose=1,
    batch_size=8)

Here is a reproduction of the error in Colab: https://colab.research.google.com/drive/1f7I2St2U3LxaWZZSTxT2xWN14gCZMBSE?usp=sharing

python tensorflow machine-learning keras tensorflow-datasets
1 Answer

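This error occurs when the tensors reaching the loss function do not match the tensor types declared in the generator's output signature. In your case, the output signature must declare int64 and float64 types.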

data_gen = DataGenerator(a, 3)
dataset = tf.data.Dataset.from_generator(
        data_gen,
        output_signature=(
            tf.TensorSpec(shape=(None,10), dtype=tf.int64), # <- here
            tf.TensorSpec(shape=(None,), dtype=tf.float64))) # <- and here
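
Note that the corrected signature above also writes the label shape as (None,) where the question had (None). In Python, parentheses alone do not create a tuple, so (None) is just None, and a TensorSpec whose shape is None describes a tensor of unknown rank, which is consistent with the wording of the error. A minimal pure-Python sketch of the difference follows; describe_shape is a hypothetical helper written for illustration and is not part of TensorFlow:

```python
# Parentheses alone do not make a tuple; the trailing comma does.
label_shape_original = (None)   # the same object as None
label_shape_fixed = (None,)     # one-element tuple: rank 1, length unknown


def describe_shape(shape):
    """Hypothetical helper: summarise how a shape argument would be read."""
    if shape is None:
        return "unknown rank"
    return f"rank {len(shape)}"


print(describe_shape(label_shape_original))  # unknown rank
print(describe_shape(label_shape_fixed))     # rank 1
print(describe_shape((None, 10)))            # rank 2
```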

After changing the types, your Colab example starts running: Colab example
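
One way to catch this kind of mismatch before calling fit is to compare a single element from the generator against the declared signature. The sketch below assumes TensorFlow 2.x and NumPy are installed. check_against_signature is a hypothetical helper, not a TensorFlow API, and x and y are stand-ins that only imitate the shapes in the question (the integer dtype is forced to int64 so the result does not depend on the platform default):

```python
import numpy as np
import tensorflow as tf


def check_against_signature(element, signature):
    """Hypothetical helper: list mismatches between tensors and their TensorSpecs."""
    problems = []
    for i, (tensor, spec) in enumerate(zip(element, signature)):
        if spec.shape.rank is None:
            problems.append(f"component {i}: signature has unknown rank")
        elif not spec.shape.is_compatible_with(tensor.shape):
            problems.append(
                f"component {i}: shape {tensor.shape} does not fit {spec.shape}")
        if tensor.dtype != spec.dtype:
            problems.append(
                f"component {i}: dtype {tensor.dtype.name} != {spec.dtype.name}")
    return problems


# Stand-ins for one (x, y) pair yielded by the generator
x = tf.convert_to_tensor(np.random.randint(0, 100, size=[40, 10], dtype=np.int64))
y = tf.convert_to_tensor(np.random.random(40))  # NumPy floats are float64

# Signature from the question: (None) is None, so the label rank is unknown
original_signature = (
    tf.TensorSpec(shape=(None, 10), dtype=tf.int32),
    tf.TensorSpec(shape=(None), dtype=tf.float32),
)

# Signature from the answer
fixed_signature = (
    tf.TensorSpec(shape=(None, 10), dtype=tf.int64),
    tf.TensorSpec(shape=(None,), dtype=tf.float64),
)

for name, signature in [("original", original_signature),
                        ("fixed", fixed_signature)]:
    problems = check_against_signature((x, y), signature)
    print(name, "->", problems or "no mismatches")
```

With these stand-ins, the original signature reports two dtype mismatches plus the unknown-rank label spec, while the fixed signature reports none.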
