The whole idea behind posting this solution is that I could not find any concrete answer online, so I will explain what I was doing and what caused the error. I hope it helps the next person who searches the internet for this error.

I am trying to implement LoRA, a technique that adapts an ML model by adding trainable low-rank updates to its weight matrices.
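To give a rough idea of the pattern (this is only an illustrative NumPy sketch of the general LoRA idea, not my actual implementation; all names and sizes here are made up):

import numpy as np

# LoRA sketch: keep the pretrained weight matrix W frozen and learn a
# low-rank update B @ A (rank r much smaller than the layer dimensions).
d_in, d_out, r = 768, 768, 8

W = np.random.randn(d_in, d_out)       # pretrained weights, frozen
A = np.random.randn(r, d_out) * 0.01   # trainable, shape (r, d_out)
B = np.zeros((d_in, r))                # trainable, initialized to zero

def lora_forward(x):
    # Effective weight is W + B @ A, but it is never materialized explicitly.
    return x @ W + (x @ B) @ A

x = np.random.randn(1, d_in)
y = lora_forward(x)                    # same shape as x @ W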
While building a modified model on top of a pre-trained model imported from TensorFlow Hub, I kept getting the following error message:
ValueError Traceback (most recent call last)
Cell In[36], line 1
----> 1 model = build_model()
2 model.summary()
Cell In[35], line 5
2 inputs = tf.keras.Input(shape=(224, 224, 3))
3 #normalized_inputs = (inputs / 127.5) - 1.0
----> 5 x = vit_model(inputs) # ViT backbone
6 x = tf.keras.layers.Dense(128, activation='relu')(x)
8 ''' The notation (x) at the end of each layer indicates that we are applying the layer to the output of the previous layer. Here’s how this works in detail:
9
10 Layer as a Function: In Keras, layers are callable objects. When you write:
11
12 x = tf.keras.layers.Dense(128, activation='relu')(x)
13 you are effectively treating the dense layer as a function that takes x (the output of the ViT model or the previous dense layer) as input and returns a new tensor as output. This new tensor is then reassigned to x. '''
File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/tf_keras/src/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
67 filtered_tb = _process_traceback_frames(e.__traceback__)
68 # To get the full stack trace, call:
69 # `tf.debugging.disable_traceback_filtering()`
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
...
A KerasTensor is symbolic: it's a placeholder for a shape and a dtype. It doesn't have any actual numerical value. You cannot convert it to a NumPy array.
Call arguments received by layer 'keras_layer_7' (type KerasLayer):
• inputs=<KerasTensor shape=(None, 224, 224, 3), dtype=float32, sparse=False, name=keras_tensor_17>
• training=None
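For reference, the failing cell looked roughly like this (reconstructed from the traceback above; the normalization line was commented out, and the final output layer is an assumption I add just to make the snippet complete):

import tensorflow as tf
import tensorflow_hub as hub

# Pre-trained ViT feature extractor from TensorFlow Hub, kept frozen.
vit_model = hub.KerasLayer("https://tfhub.dev/sayakpaul/vit_b16_fe/1", trainable=False)

def build_model():
    inputs = tf.keras.Input(shape=(224, 224, 3))
    # normalized_inputs = (inputs / 127.5) - 1.0
    x = vit_model(inputs)                                  # ViT backbone -- fails here
    x = tf.keras.layers.Dense(128, activation='relu')(x)
    outputs = tf.keras.layers.Dense(1, activation='sigmoid')(x)  # assumed output head
    return tf.keras.Model(inputs, outputs)

model = build_model()
model.summary()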
Unfortunately, this error is somewhat misleading. It first suggests there is a problem with the model's inputs, so I spent about an hour going through the preprocessing steps and reading the requirements of the pre-trained model, but I still could not resolve the error.
After researching for a while across different resources, I found that the problem was in how the pre-trained model was imported, not in any of the steps the error points to.
So I changed my code from:
vit_model = hub.KerasLayer("https://tfhub.dev/sayakpaul/vit_b16_fe/1", trainable=False)
to:
vit_model = tf.keras.applications.VGG16(include_top=False, input_shape=(224, 224, 3))
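With that change, the same functional-API code builds without the KerasTensor error. A minimal sketch of the full working version is below; note that, unlike the hub feature extractor, VGG16 with include_top=False returns a 4D feature map, so I add a pooling layer before the dense head. The pooling choice and the output layer are my assumptions, not a prescription:

import tensorflow as tf

# VGG16 backbone from tf.keras.applications instead of a hub.KerasLayer.
vit_model = tf.keras.applications.VGG16(include_top=False, input_shape=(224, 224, 3))
vit_model.trainable = False  # keep the backbone frozen, as with the hub version

def build_model():
    inputs = tf.keras.Input(shape=(224, 224, 3))
    x = vit_model(inputs)                            # (None, 7, 7, 512) feature map
    x = tf.keras.layers.GlobalAveragePooling2D()(x)  # assumed: collapse spatial dims
    x = tf.keras.layers.Dense(128, activation='relu')(x)
    outputs = tf.keras.layers.Dense(1, activation='sigmoid')(x)  # assumed output head
    return tf.keras.Model(inputs, outputs)

model = build_model()
model.summary()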