WARNING:tensorflow: Using a while_loop for converting because there is no registered converter for this op

Question (votes: 0, answers: 6)

I am training a Keras model and needed to switch machines to get more power (from a Windows i3 to an Ubuntu i7). The problem is that my code runs fine on Windows, but on the new machine it shows the following warnings and the run stops before the first epoch even finishes. Here is the full output:

/home/willylutz/PycharmProjects/hiv_image_analysis/venv/bin/python /home/willylutz/PycharmProjects/hiv_image_analysis/main.py 
2022-09-19 09:36:52.801711: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-19 09:36:52.956260: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2022-09-19 09:36:53.502748: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/willylutz/PycharmProjects/hiv_image_analysis/venv/lib/python3.8/site-packages/cv2/../../lib64:
2022-09-19 09:36:53.502794: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/willylutz/PycharmProjects/hiv_image_analysis/venv/lib/python3.8/site-packages/cv2/../../lib64:
2022-09-19 09:36:53.502800: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
0
Found 480 files belonging to 2 classes.
Using 384 files for training.
2022-09-19 09:37:00.058171: E tensorflow/stream_executor/cuda/cuda_driver.cc:265] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2022-09-19 09:37:00.058202: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (zhang): /proc/driver/nvidia/version does not exist
2022-09-19 09:37:00.067388: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Found 480 files belonging to 2 classes.
Using 96 files for validation.
['INF', 'NI']
2022-09-19 09:37:10.149236: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:390] Filling up shuffle buffer (this may take a while): 373 of 512
2022-09-19 09:37:10.197351: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:415] Shuffle buffer filled.
(64, 1024, 1024, 3)
(64,)
WARNING:tensorflow:Using a while_loop for converting RngReadAndSkip cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting StatelessRandomUniformV2 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting ImageProjectiveTransformV3 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting RngReadAndSkip cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting StatelessRandomUniformV2 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting ImageProjectiveTransformV3 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting RngReadAndSkip cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting StatelessRandomUniformV2 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting ImageProjectiveTransformV3 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting RngReadAndSkip cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting StatelessRandomUniformV2 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting ImageProjectiveTransformV3 cause there is no registered converter for this op.
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 sequential (Sequential)     (None, 1024, 1024, 3)     0         
                                                                 
 rescaling (Rescaling)       (None, 1024, 1024, 3)     0         
                                                                 
 conv2d (Conv2D)             (None, 1024, 1024, 16)    448       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 512, 512, 16)     0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 512, 512, 32)      4640      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 256, 256, 32)     0         
 2D)                                                             
                                                                 
 conv2d_2 (Conv2D)           (None, 256, 256, 64)      18496     
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 128, 128, 64)     0         
 2D)                                                             
                                                                 
 dropout (Dropout)           (None, 128, 128, 64)      0         
                                                                 
 flatten (Flatten)           (None, 1048576)           0         
                                                                 
 dense (Dense)               (None, 128)               134217856 
                                                                 
 outputs (Dense)             (None, 2)                 258       
                                                                 
=================================================================
Total params: 134,241,698
Trainable params: 134,241,698
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
WARNING:tensorflow:Using a while_loop for converting RngReadAndSkip cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting StatelessRandomUniformV2 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting ImageProjectiveTransformV3 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting RngReadAndSkip cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting StatelessRandomUniformV2 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting ImageProjectiveTransformV3 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting RngReadAndSkip cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting StatelessRandomUniformV2 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting ImageProjectiveTransformV3 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting RngReadAndSkip cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting StatelessRandomUniformV2 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting ImageProjectiveTransformV3 cause there is no registered converter for this op.
2022-09-19 09:37:16.377367: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 4294967296 exceeds 10% of free system memory.

Process finished with exit code 137 (interrupted by signal 9: SIGKILL)

I can also post my code if needed, but I don't think that is where the problem lies. Thanks for your help.

python tensorflow keras deep-learning
6 Answers
8 votes

There is a bug in keras/tensorflow 2.9 and 2.10 that makes preprocessing layers such as Rescaling extremely slow: https://github.com/tensorflow/tensorflow/issues/56242

Try the model without the Rescaling layer. If you want to use this or a similar preprocessing layer, you should use TF 2.8.3 or earlier.
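If you drop the layer, one possible workaround is to do the scaling in the tf.data pipeline instead. This is a minimal sketch, not from the original answer; the dataset below is a dummy stand-in for what tf.keras.utils.image_dataset_from_directory would return.

import tensorflow as tf

# Dummy stand-in for the real dataset (in the question it would come from
# tf.keras.utils.image_dataset_from_directory).
train_ds = tf.data.Dataset.from_tensor_slices(
    (tf.zeros([8, 1024, 1024, 3], dtype=tf.uint8),
     tf.zeros([8], dtype=tf.int32))
).batch(4)

def scale(image, label):
    # uint8 pixels in [0, 255] -> float32 in [0, 1]
    return tf.cast(image, tf.float32) / 255.0, label

train_ds = train_ds.map(scale, num_parallel_calls=tf.data.AUTOTUNE)
train_ds = train_ds.prefetch(tf.data.AUTOTUNE)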

I am not sure about the following, but

Could not load dynamic library 'libnvinfer.so.7'

seems to indicate some other problem, probably a mismatch between the TensorFlow / Keras / CUDA versions.


1 vote

I ran into the same problem. I solved it by copying the source code of the Keras Rescaling layer into my own code and making my "own" Rescaling class. I changed math_ops.cast() to tf.cast(), and it works perfectly: no more warnings and the code runs much faster.

import tensorflow as tf


class Rescaling(tf.keras.layers.Layer):
  """Multiplies inputs by `scale` and adds `offset`.

  For instance:
  1. To rescale an input in the `[0, 255]` range to be in the `[0, 1]` range,
     you would pass `scale=1./255`.
  2. To rescale an input in the `[0, 255]` range to be in the `[-1, 1]` range,
     you would pass `scale=1./127.5, offset=-1`.

  The rescaling is applied both during training and inference.

  Input shape:
    Arbitrary.
  Output shape:
    Same as input.
  Arguments:
    scale: Float, the scale to apply to the inputs.
    offset: Float, the offset to apply to the inputs.
    name: A string, the name of the layer.
  """

  def __init__(self, scale, offset=0., name=None, **kwargs):
    self.scale = scale
    self.offset = offset
    super(Rescaling, self).__init__(name=name, **kwargs)

  def call(self, inputs):
    dtype = self._compute_dtype
    scale = tf.cast(self.scale, dtype)
    offset = tf.cast(self.offset, dtype)
    return tf.cast(inputs, dtype) * scale + offset

  def compute_output_shape(self, input_shape):
    return input_shape

  def get_config(self):
    config = {
        'scale': self.scale,
        'offset': self.offset,
    }
    base_config = super(Rescaling, self).get_config()
    return dict(list(base_config.items()) + list(config.items()))
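For reference, a hypothetical usage sketch (not from the original answer): the class above can be used as a drop-in replacement for tf.keras.layers.Rescaling inside a Sequential model, assuming the class and the `import tensorflow as tf` above are in scope.

# Hypothetical usage of the custom Rescaling class defined above.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1024, 1024, 3)),
    Rescaling(scale=1. / 255),
    tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2),
])
model.summary()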

1 vote

I got the same error. A likely cause is the preprocessing layers you use for data augmentation.
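For illustration only (an assumption about the questioner's setup, not code from the question): an in-model augmentation block like the one below is the usual source of these messages, because ops such as RngReadAndSkip, StatelessRandomUniformV2 and ImageProjectiveTransformV3 have no registered vectorization converter and fall back to a while_loop.

import tensorflow as tf

# Typical in-model augmentation block of the kind that triggers the
# while_loop warnings (layer choices are illustrative).
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip('horizontal', input_shape=(1024, 1024, 3)),
    tf.keras.layers.RandomRotation(0.2),  # emits ImageProjectiveTransformV3
    tf.keras.layers.RandomZoom(0.1),
])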


0 votes

These two warnings:

Could not load dynamic library 'libnvinfer.so.7'
libnvinfer_plugin.so.7: cannot open shared object file

can be safely ignored.

NVidia has a custom neural-network framework called TensorRT.

In short, TensorRT optimizes the internal model graph so that the model executes faster. Usually you first export the TensorFlow model to ONNX and then convert it from ONNX to TensorRT. Recent work from TF and NVidia makes it possible to exploit this and run TensorFlow models faster when they are executed in inference mode on a machine with an NVidia GPU.
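For context only, a rough sketch of what the TF-TRT conversion path looks like on a machine that does have an NVidia GPU and the TensorRT libraries installed; the SavedModel paths are placeholders and this is not needed to fix the warning.

# Rough TF-TRT sketch; 'saved_model' and 'saved_model_trt' are placeholders.
from tensorflow.python.compiler.tensorrt import trt_convert as trt

converter = trt.TrtGraphConverterV2(input_saved_model_dir='saved_model')
converter.convert()                 # build the TensorRT-optimized graph
converter.save('saved_model_trt')   # write the optimized SavedModel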

Previously you had to build TF from source to enable the TensorRT integration. In the latest releases it seems to be enabled by default (I assume you were on TF v2.11 before downgrading to TF v2.8).

However, to run a TF model with TensorRT optimizations you need certain NVidia drivers and libraries installed on your machine. Since you do not have them installed, TF warns you that it cannot find them.

As said at the top, you can safely ignore this warning; it does not affect your training or TF's behaviour in any way.


0 votes

Data augmentation is one cause of this issue. I commented out the rotation option (layers.experimental.preprocessing.RandomRotation(0.2)) and tried again, and the warnings were gone.
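A minimal sketch of that change (assuming the experimental preprocessing API the answer mentions; the surrounding layers are illustrative):

import tensorflow as tf
from tensorflow.keras import layers

# Keep the flip, comment out the rotation that triggers the
# ImageProjectiveTransformV3 / while_loop warnings.
data_augmentation = tf.keras.Sequential([
    layers.experimental.preprocessing.RandomFlip('horizontal'),
    # layers.experimental.preprocessing.RandomRotation(0.2),
])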


0 votes

I had the same problem with autokeras. The first training run always works well, then I get these warnings and from that point on training becomes extremely slow. I tried going back to TensorFlow 2.8.3 and Keras 2.8, but that raised another problem (something could not be found in the tf experimental library). I also tried TF 2.10 on GPU, but gave up on it because I quickly ran into out-of-memory issues. I then tried the latest TF and Keras: the behaviour is the same, just without the warnings. After the first training run, one epoch goes from 2-3 minutes to 2-3 hours.

What version of Keras are you using?
