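I am working through the TensorFlow time series tutorial, which takes weather data and predicts future temperature values.
I am having a hard time understanding how the window generator works and how each of its variables behaves.
This is the code that generates a window:
wide_window = WindowGenerator(input_width=24, label_width=1, shift=1, label_columns=['T (degC)'])
Now, according to the tutorial, this means the model takes 24 hours of weather data and predicts the temperature for the next 1 hour (i.e. 1 value). This works.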
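For reference, the index arithmetic that the class performs in its __init__ can be reproduced on its own. The sketch below is a standalone copy of that arithmetic only; the helper name window_indices is made up for illustration and is not part of the tutorial.

```python
import numpy as np

def window_indices(input_width, label_width, shift):
    """Hypothetical helper mirroring the arithmetic in WindowGenerator.__init__."""
    total_window_size = input_width + shift
    all_indices = np.arange(total_window_size)
    # Inputs are always the first `input_width` steps of the window.
    input_indices = all_indices[slice(0, input_width)]
    # Labels are always the last `label_width` steps of the window.
    label_start = total_window_size - label_width
    label_indices = all_indices[slice(label_start, None)]
    return total_window_size, input_indices, label_indices

print(window_indices(24, 1, 1)[2])  # [24]
print(window_indices(24, 1, 3)[2])  # [26]
print(window_indices(24, 3, 1)[2])  # [22 23 24]
```

So shift moves the end of the window further from the last input, and label_width counts how many steps at the end of the window become labels.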
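However, if I change shift (how far into the future to predict) to anything other than 1:
wide_window = WindowGenerator(input_width=24, label_width=1, shift=3, label_columns=['T (degC)'])
Then the model returns no predictions, which makes the WindowGenerator's plot function crash.
Similarly, if I change label_width (by the looks of it, how many predictions to make, counting backwards from shift) to anything other than 24 or 1:
wide_window = WindowGenerator(input_width=24, label_width=3, shift=1, label_columns=['T (degC)'])
Then the model.fit method crashes with the error: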
Exception has occurred: ValueError
Dimensions must be equal, but are 3 and 24 for '{{node compile_loss/mean_squared_error/sub}} = Sub[T=DT_FLOAT](data_1, sequential_1/dense_1/Add)' with input shapes: [?,3,1], [?,24,1].
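My question is: why? Why can I make only 24 predictions or 1 prediction, but not 3? And why can I predict 1 step into the future but no further?
Here is the relevant part of the code (to keep this snippet compact, I have left out the tutorial's data download and formatting):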
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

# WINDOW GENERATOR CLASS
class WindowGenerator():
    def __init__(self, input_width, label_width, shift,
                 train_df=train_df, val_df=val_df, test_df=test_df,
                 label_columns=None):
        # Store the raw data.
        self.train_df = train_df
        self.val_df = val_df
        self.test_df = test_df
        # Work out the label column indices.
        self.label_columns = label_columns
        if label_columns is not None:
            self.label_columns_indices = {name: i for i, name in
                                          enumerate(label_columns)}
        self.column_indices = {name: i for i, name in
                               enumerate(train_df.columns)}
        # Work out the window parameters.
        self.input_width = input_width
        self.label_width = label_width
        self.shift = shift
        self.total_window_size = input_width + shift
        self.input_slice = slice(0, input_width)
        self.input_indices = np.arange(self.total_window_size)[self.input_slice]
        self.label_start = self.total_window_size - self.label_width
        self.labels_slice = slice(self.label_start, None)
        self.label_indices = np.arange(self.total_window_size)[self.labels_slice]

    def __repr__(self):
        return '\n'.join([
            f'Total window size: {self.total_window_size}',
            f'Input indices: {self.input_indices}',
            f'Label indices: {self.label_indices}',
            f'Label column name(s): {self.label_columns}'])

    # SPLIT FUNCTION
    def split_window(self, features):
        inputs = features[:, self.input_slice, :]
        labels = features[:, self.labels_slice, :]
        if self.label_columns is not None:
            labels = tf.stack(
                [labels[:, :, self.column_indices[name]] for name in self.label_columns],
                axis=-1)
        # Slicing doesn't preserve static shape information, so set the shapes
        # manually. This way the `tf.data.Datasets` are easier to inspect.
        inputs.set_shape([None, self.input_width, None])
        labels.set_shape([None, self.label_width, None])
        return inputs, labels

    # PLOT FUNCTION
    def plot(self, model=None, plot_col='T (degC)', max_subplots=3):
        inputs, labels = self.example
        plt.figure(figsize=(12, 8))
        plot_col_index = self.column_indices[plot_col]
        max_n = min(max_subplots, len(inputs))
        for n in range(max_n):
            plt.subplot(max_n, 1, n+1)
            plt.ylabel(f'{plot_col} [normed]')
            plt.plot(self.input_indices, inputs[n, :, plot_col_index],
                     label='Inputs', marker='.', zorder=-10)
            if self.label_columns:
                label_col_index = self.label_columns_indices.get(plot_col, None)
            else:
                label_col_index = plot_col_index
            if label_col_index is None:
                continue
            plt.scatter(self.label_indices, labels[n, :, label_col_index],
                        edgecolors='k', label='Labels', c='#2ca02c', s=64)
            if model is not None:
                predictions = model(inputs)
                plt.scatter(self.label_indices,
                            predictions[n, self.label_indices[0]-1:self.label_indices[-1], label_col_index],
                            marker='X', edgecolors='k', label='Predictions',
                            c='#ff7f0e', s=64)
                # ERROR WHEN SHIFT IS NOT 1 BECAUSE NO PREDICTION:
                # Exception has occurred: ValueError. x and y must be the same size
            if n == 0:
                plt.legend()
        plt.xlabel('Time [h]')
        plt.show()

    # MAKE DATASET FUNCTION
    def make_dataset(self, data):
        data = np.array(data, dtype=np.float32)
        ds = tf.keras.utils.timeseries_dataset_from_array(
            data=data,
            targets=None,
            sequence_length=self.total_window_size,
            sequence_stride=1,
            shuffle=True,
            batch_size=BATCH_SIZE,)
        ds = ds.map(self.split_window)
        return ds

    @property
    def train(self):
        return self.make_dataset(self.train_df)

    @property
    def val(self):
        return self.make_dataset(self.val_df)

    @property
    def test(self):
        return self.make_dataset(self.test_df)

    @property
    def example(self):
        """Get and cache an example batch of `inputs, labels` for plotting."""
        result = getattr(self, '_example', None)
        if result is None:
            # No example batch was found, so get one from the `.train` dataset
            result = next(iter(self.train))
            # And cache it for next time
            self._example = result
        return result

val_performance = {}
performance = {}

# WIDE WINDOW
wide_window = WindowGenerator(
    input_width=24, label_width=1, shift=1,
    label_columns=['T (degC)'])
# print(wide_window)
# print('Input shape:', wide_window.example[0].shape)
# print('Output shape:', baseline(wide_window.example[0]).shape)

# LINEAR MODEL
linear = tf.keras.Sequential([
    tf.keras.layers.Dense(units=1)
])
# print('Input shape:', single_step_window.example[0].shape)
# print('Output shape:', linear(single_step_window.example[0]).shape)

# COMPILE AND FIT FUNCTION
MAX_EPOCHS = 20

def compile_and_fit(model, window, patience=2):
    early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                                      patience=patience,
                                                      mode='min')
    model.compile(loss=tf.keras.losses.MeanSquaredError(),
                  optimizer=tf.keras.optimizers.Adam(),
                  metrics=[tf.keras.metrics.MeanAbsoluteError()])
    history = model.fit(window.train, epochs=MAX_EPOCHS,
                        validation_data=window.val,
                        callbacks=[early_stopping])
    # ERROR WHEN label_width IS NOT 1 OR 24 (WHY THESE VALUES?):
    # Exception has occurred: ValueError
    # Dimensions must be equal, but are 3 and 24 for '{{node compile_loss/mean_squared_error/sub}} = Sub[T=DT_FLOAT](data_1, sequential_1/dense_1/Add)' with input shapes: [?,3,1], [?,24,1].
    return history

# COMPILE AND FIT THE LINEAR MODEL ONTO THE WIDE WINDOW
history = compile_and_fit(linear, wide_window)
print('Input shape:', wide_window.example[0].shape)
print('Output shape:', linear(wide_window.example[0]).shape)
val_performance['Linear'] = linear.evaluate(wide_window.val, return_dict=True)
performance['Linear'] = linear.evaluate(wide_window.test, verbose=0, return_dict=True)
wide_window.plot(linear)
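After going over this again and again, my understanding of why this happens is as follows:
The reason shift=3 does not work is that, in the example used, I am working with a single-step model. By definition, these single-step models can only predict a value shifted 1 unit to the right (1 hour in this case). For a shift of 3, the model would have to output 3 predictions. That is the job of a multi-step model, which is covered further on in the tutorial.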
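A closer look at the plot function suggests the model still returns predictions for shift=3, and that the crash may come from the slice I apply to them. The sketch below is only a stand-in: it assumes the Dense model outputs one value per input step (24 here, which matches the [?,24,1] shape in the error message above), replaces the model output with zeros, and applies the same slice as the plot code. The function name plotted_prediction_count is made up for illustration.

```python
import numpy as np

def plotted_prediction_count(input_width, label_width, shift):
    """Return (number of label x positions, number of predictions selected)."""
    total_window_size = input_width + shift
    label_start = total_window_size - label_width
    label_indices = np.arange(total_window_size)[label_start:]
    # Assumption: stand-in for model(inputs)[n, :, col], one value per input step.
    predictions = np.zeros(input_width)
    # Same slice as in plot(): label_indices[0]-1 up to label_indices[-1].
    selected = predictions[label_indices[0] - 1:label_indices[-1]]
    return len(label_indices), len(selected)

print(plotted_prediction_count(24, 1, 1))  # (1, 1)
print(plotted_prediction_count(24, 1, 3))  # (1, 0)
```

With shift=3 the slice starts at index 25, past the end of the 24 predictions, so it selects nothing. plt.scatter then receives one x value and zero y values, which is consistent with the "x and y must be the same size" error.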
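label_width has to match input_width: the reason label_width needs to be either 1 or the value of input_width is that when model.fit is called, it takes window.train as the input argument (x), while the target argument (y) is left empty (as per https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit). The model therefore expects to output either a single value or a series of values that matches the number of inputs.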
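This is the best explanation I have come up with so far for this behaviour. Nowhere in the tutorial is label_width set to anything other than 1 or the same value as input_width. I don't 100% understand why 1 works as a value, since the label_width error always complains about dimensions not being equal... surely it should fail with a similar message, such as
Dimensions must be equal, but are 1 and 24
but it doesn't.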
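One possible explanation for why 1 is accepted is broadcasting. The failing op in the error is a subtraction between the labels ([?,3,1]) and the model output ([?,24,1]), and as far as I know TensorFlow follows NumPy-style broadcasting rules for that, where an axis of length 1 is stretched to match the other operand. The sketch below uses plain NumPy as a stand-in for the loss subtraction; the function name can_subtract and the batch size of 32 are assumptions for illustration.

```python
import numpy as np

def can_subtract(label_width, output_steps=24, batch=32):
    """Return the broadcast result shape, or None if the shapes are incompatible."""
    labels = np.zeros((batch, label_width, 1))
    predictions = np.zeros((batch, output_steps, 1))
    try:
        return (labels - predictions).shape
    except ValueError:
        # NumPy raises ValueError when shapes cannot be broadcast together.
        return None

print(can_subtract(1))   # (32, 24, 1)
print(can_subtract(24))  # (32, 24, 1)
print(can_subtract(3))   # None
```

If that is what happens, label_width=1 does not fail because the single label is broadcast against all 24 per-step outputs, whereas 3 and 24 cannot be matched up at all.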
I will update this answer if I learn anything new.