I have the following TensorFlow model that I want to quantize:
model = Sequential([
    Input(shape=input_shape),
    LSTM(lstm_units_1, return_sequences=True),
    Dropout(dropout_rate),
    LSTM(lstm_units_2, return_sequences=False),
    Dropout(dropout_rate),
    Dense(4, activation='softmax')
])
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
model_checkpoint = ModelCheckpoint(model_path, monitor='val_loss', save_best_only=True, save_weights_only=False, mode='min')
history = model.fit(X_train, y_train,
                    epochs=epochs,
                    batch_size=batch_size,
                    validation_split=0.2,
                    callbacks=[early_stopping],
                    verbose=1)
model.save(model_path)
I am trying to perform quantization like this:
annotated_model = tfmot.quantization.keras.quantize_annotate_model(model)
with tfmot.quantization.keras.quantize_scope():
    quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)
quant_aware_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
But I get this error:
ValueError: `to_annotate` can only be a `keras.Model` instance. Use the `quantize_annotate_layer` API to handle individual layers. You passed an instance of type: Sequential.
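Quantizing each layer individually, as the error message suggests, did not work for me either: it raises another ValueError saying that the LSTM layer is not accepted as input.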
annotated_model = tf.keras.Sequential([
    tfmot.quantization.keras.quantize_annotate_layer(layer)
    for layer in model.layers
])
What is the correct way to quantize the specific model I am using here?
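Either rewrite the model in pure TensorFlow or write it entirely in Keras, but do not mix the two, because that can lead to many errors. I suggest using Keras for everything. The example below builds a small Keras model and converts it to TensorFlow Lite with post-training float16 quantization.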
import keras
from keras import layers
from keras import ops
import numpy as np
# Define Sequential model with 3 layers
model = keras.Sequential(
    [
        layers.Dense(2, activation="relu", name="layer1"),
        layers.Dense(3, activation="relu", name="layer2"),
        layers.Dense(4, name="layer3"),
    ]
)
# Compile the model
model.compile(
    optimizer='adam',
    loss='mean_squared_error',
    metrics=['accuracy']
)
# Example data for training
x_train = np.random.random((100, 3)).astype('float32')
y_train = np.random.random((100, 4)).astype('float32')
# Train the model
model.fit(x_train, y_train, epochs=10, batch_size=32)
# Convert the model to TensorFlow Lite with post-training quantization to float16
import tensorflow as tf
# Convert the model to a TensorFlow Lite model
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()
# Save the quantized model
with open("model_quantized_f16.tflite", "wb") as f:
    f.write(tflite_model)
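To get a feel for what the float16 option above does to the weights, here is a minimal NumPy-only sketch. It is not the converter's actual implementation, just an illustration under the assumption that float16 post-training quantization amounts to storing float32 weight tensors as float16 and casting them back to float32 when needed. The weight matrix and the helper names `quantize_weights_to_float16` and `dequantize_weights` are hypothetical and made up for this example.

```python
import numpy as np


def quantize_weights_to_float16(weights):
    """Store float32 weights as float16 (illustrative helper, not a TFLite API)."""
    return weights.astype(np.float16)


def dequantize_weights(weights_f16):
    """Cast float16 weights back to float32 for computation (illustrative helper)."""
    return weights_f16.astype(np.float32)


# Hypothetical weight matrix standing in for one Dense kernel
rng = np.random.default_rng(0)
weights = rng.uniform(-1.0, 1.0, size=(3, 4)).astype(np.float32)

weights_f16 = quantize_weights_to_float16(weights)
restored = dequantize_weights(weights_f16)

# float16 uses 2 bytes per value instead of 4
size_ratio = weights_f16.nbytes / weights.nbytes
# Rounding error introduced by the lower-precision storage
max_abs_error = float(np.max(np.abs(weights - restored)))

print(f"size ratio (float16 / float32): {size_ratio}")
print(f"max absolute rounding error: {max_abs_error:.2e}")
```

The stored weights take half the space, and the rounding error stays small for weights of this magnitude, which is why float16 quantization usually costs little accuracy. Whether that holds for your LSTM model is something you would need to check by evaluating the converted model on your validation data.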