Low CNN inference performance on Jetson Nano


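I am running what I believe is a fairly small CNN on an NVIDIA Jetson Nano with JetPack 4.4. NVIDIA claims the Nano can run ResNet-50 at 36 fps, so I expected my much smaller network to run comfortably at 30+ fps.

In practice, each forward pass takes 160-180 ms, so I get 5-6 fps at best.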

My CNN:

    Model: "sequential"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    lambda (Lambda)              (None, 210, 848, 3)       0
    _________________________________________________________________
    max_pooling2d (MaxPooling2D) (None, 210, 282, 3)       0
    _________________________________________________________________
    conv2d (Conv2D)              (None, 102, 138, 16)      2368
    _________________________________________________________________
    max_pooling2d_1 (MaxPooling2 (None, 51, 69, 16)        0
    _________________________________________________________________
    conv2d_1 (Conv2D)            (None, 24, 33, 32)        12832
    _________________________________________________________________
    max_pooling2d_2 (MaxPooling2 (None, 12, 16, 32)        0
    _________________________________________________________________
    conv2d_2 (Conv2D)            (None, 4, 6, 64)          51264
    _________________________________________________________________
    max_pooling2d_3 (MaxPooling2 (None, 2, 3, 64)          0
    _________________________________________________________________
    flatten (Flatten)            (None, 384)               0
    _________________________________________________________________
    dropout (Dropout)            (None, 384)               0
    _________________________________________________________________
    dense (Dense)                (None, 64)                24640
    _________________________________________________________________
    dropout_1 (Dropout)          (None, 64)                0
    _________________________________________________________________
    elu (ELU)                    (None, 64)                0
    _________________________________________________________________
    dense_1 (Dense)              (None, 1)                 65
    =================================================================
    Total params: 91,169
    Trainable params: 91,169
    Non-trainable params: 0
    _________________________________________________________________

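As a rough sanity check on how small this network is, the parameter counts in the summary can be reproduced and the number of multiply-accumulate operations (MACs) per forward pass estimated. This is only a sketch. The kernel sizes (7x7, 5x5, 5x5) are not stated anywhere above; they are inferred from the Param # column on the assumption of square kernels with a bias term. Pooling, activation, and the Lambda layer are treated as free.

```python
# Rough parameter and MAC estimate for the model summary above.
# ASSUMPTION: square kernels with bias; the kernel sizes (7, 5, 5) are
# inferred from the "Param #" column, not taken from the model definition.

# (name, kernel, in_channels, out_channels, out_height, out_width)
CONV_LAYERS = [
    ("conv2d",   7,  3, 16, 102, 138),
    ("conv2d_1", 5, 16, 32,  24,  33),
    ("conv2d_2", 5, 32, 64,   4,   6),
]
# (name, in_features, out_features)
DENSE_LAYERS = [
    ("dense",   384, 64),
    ("dense_1",  64,  1),
]


def conv_cost(kernel, c_in, c_out, h_out, w_out):
    """Return (params, macs) for one Conv2D layer."""
    macs_per_output = kernel * kernel * c_in
    params = (macs_per_output + 1) * c_out  # +1 for the bias
    macs = macs_per_output * c_out * h_out * w_out
    return params, macs


def dense_cost(n_in, n_out):
    """Return (params, macs) for one Dense layer."""
    return (n_in + 1) * n_out, n_in * n_out


def network_cost():
    """Sum params and MACs over all conv and dense layers."""
    total_params = total_macs = 0
    for _, *shape in CONV_LAYERS:
        p, m = conv_cost(*shape)
        total_params += p
        total_macs += m
    for _, n_in, n_out in DENSE_LAYERS:
        p, m = dense_cost(n_in, n_out)
        total_params += p
        total_macs += m
    return total_params, total_macs


if __name__ == "__main__":
    params, macs = network_cost()
    print("params: {:,}".format(params))
    print("MACs per forward pass: {:,}".format(macs))
```

Under these assumptions the parameter total matches the 91,169 reported by the summary, and the estimate comes to roughly 44.5 million MACs per frame. ResNet-50 is commonly quoted at a few billion multiply-adds per image, so by this measure the network really is far smaller.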
Code:

    import numpy as np
    import cv2
    import time

    import tensorflow as tf
    from tensorflow import keras

    model_name = 'v9_small_FC_epoch_3'

    loaded_model = keras.models.load_model('/home/jetson/notebooks/trained_models/' + model_name + '.h5')
    loaded_model.summary()

    frame = cv2.imread('/home/jetson/notebooks/frame1.jpg')
    test_data = np.expand_dims(frame, axis=0)  # add a batch dimension

    for i in range(10):
        start = time.time()
        predictions = loaded_model.predict(test_data)
        print(predictions[0][0])
        end = time.time()
        print("Inference took {}s".format(end - start))

Results:

    4.7763316333293915
    Inference took 10.111131191253662s
    4.7763316333293915
    Inference took 0.1822071075439453s
    4.7763316333293915
    Inference took 0.17330455780029297s
    4.7763316333293915
    Inference took 0.18085694313049316s
    4.7763316333293915
    Inference took 0.16646790504455566s
    4.7763316333293915
    Inference took 0.1703803539276123s
    4.7763316333293915
    Inference took 0.1788337230682373s
    4.7763316333293915
    Inference took 0.17131853103637695s
    4.7763316333293915
    Inference took 0.1660606861114502s
    4.7763316333293915
    Inference took 0.18377089500427246s
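The first call in the log above takes about 10 s while the later ones settle at roughly 0.17 s, so it is worth separating one-off start-up cost from steady-state latency. A minimal sketch of such a measurement follows. `benchmark` and `latency_to_fps` are hypothetical helper names, not part of Keras. The commented usage at the end also tries calling the model directly instead of going through `predict`, which the Keras documentation suggests for small inputs that fit in one batch. Whether that helps on the Nano is something to measure, not a given.

```python
import time


def benchmark(fn, warmup=1, runs=10):
    """Time fn() and return (mean_seconds, fps), ignoring warm-up calls.

    Hypothetical helper: the first `warmup` calls are executed but not
    timed, so one-off start-up cost does not skew the average.
    """
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    mean = sum(timings) / len(timings)
    fps = 1.0 / mean if mean > 0 else float("inf")
    return mean, fps


def latency_to_fps(seconds):
    """Convert a per-frame latency in seconds to frames per second."""
    return 1.0 / seconds


if __name__ == "__main__":
    # Latencies reported in the question: 160-180 ms per forward pass.
    for ms in (160, 180):
        print("{} ms -> {:.2f} fps".format(ms, latency_to_fps(ms / 1000.0)))

    # Usage with the model from the question (assumes TensorFlow 2.x with
    # eager execution; not run here):
    #   mean, fps = benchmark(lambda: loaded_model.predict(test_data))
    #   x = tf.constant(test_data, dtype=tf.float32)
    #   mean, fps = benchmark(lambda: loaded_model(x, training=False))
```

Using `time.perf_counter` rather than `time.time` gives a monotonic, higher-resolution clock, and keeping `print` calls outside the timed region avoids counting console I/O as inference time.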
    
keras performance
1个回答
0
投票
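Use a lighter model on the Jetson Nano. The inference latency you are seeing is to be expected, because the Jetson cannot handle large models. Try reducing the number of layers in the CNN. Another option may be to disable the Jetson's GUI to free up RAM for the CNN.

As it stands, you cannot use this model in real time because of its inference latency.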
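To judge whether freeing RAM is likely to matter, the available memory can be compared with the GUI running and with it disabled. The sketch below parses the standard Linux `/proc/meminfo` format. `parse_meminfo` and `available_mib` are hypothetical helper names, and the script assumes a kernel that reports a `MemAvailable` field.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: kilobytes}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Field: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_mib(text):
    """Return MemAvailable in MiB, or None if the field is missing."""
    kb = parse_meminfo(text).get("MemAvailable")
    return None if kb is None else kb / 1024.0


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:  # present on Linux only
            mib = available_mib(f.read())
    except OSError:
        mib = None
    if mib is None:
        print("MemAvailable not reported on this system")
    else:
        print("MemAvailable: {:.0f} MiB".format(mib))
```

Running it once with the desktop active and once after booting without it shows how much memory the GUI actually takes on a given board.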
