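I am trying to use a pretrained YOLOv8 model for object detection and tracking. The model loads successfully, but for some reason it detects many random objects that have nothing to do with the image, all with a confidence of 1.0.
Here is my code: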
from ultralytics import YOLO
import cv2

# Load the YOLOv8 model
model = YOLO("yolov8n.pt")

# Load the video
video_path = './puppy.mp4'
cap = cv2.VideoCapture(video_path)

ret = True
while ret:
    # Read the next frame; ret is True if a frame was read successfully, else False
    ret, frame = cap.read()
    if not ret:
        break

    # Detect and track objects
    # persist=True keeps the tracker state between frames
    results = model.track(frame, persist=True)

    # Draw the detections on a copy of the frame
    # (cv2.rectangle and cv2.putText could also be used)
    frame_ = results[0].plot()

    # Display the annotated frame; press 'q' to quit
    cv2.imshow('frame', frame_)
    if cv2.waitKey(25) & 0xFF == ord('q'):
        break

# Release the video and close the display window
cap.release()
cv2.destroyAllWindows()
Output:
0: 384x640 3 cars, 33 motorcycles, 21 airplanes, 29 trains, 52 boats, 9 fire hydrants, 7 benchs, 1 suitcase, 37 skiss, 6 snowboards, 1 sports ball, 5 baseball bats, 8 bottles, 1 fork, 20 knifes, 64 spoons, 1 toilet, 1 toaster, 1 toothbrush, 329.4ms
Speed: 4.5ms preprocess, 329.4ms inference, 51.4ms postprocess per image at shape (1, 3, 384, 640)
0: 384x640 1 bicycle, 16 cars, 4 motorcycles, 13 airplanes, 4 buss, 9 trains, 9 boats, 8 traffic lights, 1 fire hydrant, 10 stop signs, 2 benchs, 3 sheeps, 2 cows, 1 umbrella, 2 frisbees, 2 baseball bats, 1 baseball glove, 5 tennis rackets, 2 knifes, 2 spoons, 1 carrot, 3 hot dogs, 5 pizzas, 1 donut, 1 couch, 1 bed, 476.8ms
Speed: 9.3ms preprocess, 476.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)
0: 384x640 1 person, 7 cars, 3 motorcycles, 20 airplanes, 1 bus, 7 trains, 44 boats, 2 fire hydrants, 5 stop signs, 3 parking meters, 2 birds, 3 cows, 2 elephants, 1 bear, 4 umbrellas, 1 handbag, 1 frisbee, 1 knife, 4 spoons, 2 apples, 1 carrot, 1 hot dog, 1 pizza, 1 cake, 1 dining table, 651.2ms
Speed: 15.5ms preprocess, 651.2ms inference, 15.8ms postprocess per image at shape (1, 3, 384, 640)
0: 384x640 7 persons, 3 cars, 1 motorcycle, 2 trains, 13 boats, 2 traffic lights, 2 fire hydrants, 54 stop signs, 2 parking meters, 2 cats, 4 dogs, 1 cow, 18 bears, 1 zebra, 15 giraffes, 11 umbrellas, 14 handbags, 6 ties, 1 frisbee, 1 snowboard, 1 sports ball, 5 baseball gloves, 4 wine glasss, 1 fork, 1 knife, 1 spoon, 1 apple, 1 orange, 1 carrot, 2 pizzas, 1 dining table, 2 laptops, 1 cell phone, 1 sink, 1 hair drier, 656.2ms
Speed: 0.0ms preprocess, 656.2ms inference, 66.9ms postprocess per image at shape (1, 3, 384, 640)
0: 384x640 20 boats, 8 benchs, 5 cats, 4 suitcases, 48 frisbees, 3 skateboards, 20 bowls, 1 couch, 26 beds, 37 laptops, 1 mouse, 15 remotes, 30 cell phones, 27 books, 1057.5ms
Speed: 15.7ms preprocess, 1057.5ms inference, 55.0ms postprocess per image at shape (1, 3, 384, 640)
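As you can see, it detects many random objects, even though the video shows only a dog. This is my first time using YOLO, and I am new to computer vision.
I also tried running a prediction with yolov8n from the terminal, but it still detects many different objects, all with a confidence of 1.0. This is what I entered in the CLI: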
yolo predict model=yolov8n.pt source='https://ultralytics.com/images/bus.jpg'
Output:
Ultralytics YOLOv8.2.75 🚀 Python-3.12.3 torch-2.4.0+cpu CPU (11th Gen Intel Core(TM) i5-1145G7 2.60GHz)
YOLOv8n summary (fused): 168 layers, 3,151,904 parameters, 0 gradients, 8.7 GFLOPs
Downloading https://ultralytics.com/images/bus.jpg to 'bus.jpg'...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 134k/134k [00:00<00:00, 8.86MB/s]
image 1/1 C:\Users\aksha\OneDrive\Documents\Computer_Vision\tutorial_detect\bus.jpg: 640x480 79 persons, 3 bicycles, 46 cars, 46 motorcycles, 15 airplanes, 11 buss, 5 trains, 28 trucks, 49 boats, 4 fire hydrants, 2 parking meters, 2 sports balls, 6 bottles, 4 spoons, 324.2ms
Speed: 15.6ms preprocess, 324.2ms inference, 27.8ms postprocess per image at shape (1, 3, 640, 480)
Results saved to runs\detect\predict
💡 Learn more at https://docs.ultralytics.com/modes/predict
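My ultralytics, torch, and numpy versions are as follows: ultralytics 8.2.75, torch 2.4.0+cpu, numpy 1.26.4.
I am not sure what the problem is; I have uninstalled and reinstalled ultralytics twice, but that did not fix it.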
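To double-check which versions are actually installed in the active environment (useful when reinstalling does not seem to change anything), here is a minimal sketch that uses only the standard library. The helper name `installed_versions` is hypothetical and is not part of ultralytics, torch, or numpy.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names as published on PyPI.
    for name, version in installed_versions(["ultralytics", "torch", "numpy"]).items():
        print(f"{name}: {version}")
```

Running this with the same interpreter that runs the detection script shows whether the versions listed above are the ones that interpreter actually sees.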
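The likely reason is that the objects in your input differ from the objects the pretrained model was trained on.
Here are some suggestions: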
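One quick diagnostic, sketched here under stated assumptions, is to check whether a frame really has many boxes that all carry a confidence of about 1.0. The helper `looks_degenerate` and its two thresholds are hypothetical, chosen only for illustration. It takes a plain list of confidence values, which with Ultralytics can be obtained from `results[0].boxes.conf.tolist()`.

```python
def looks_degenerate(confidences, min_boxes=20, saturation=0.999):
    """Heuristic: True if there are many boxes and every confidence is ~1.0.

    Both thresholds are illustrative assumptions, not values from Ultralytics.
    """
    if len(confidences) < min_boxes:
        return False
    return all(c >= saturation for c in confidences)


if __name__ == "__main__":
    # Made-up example values, not real model output.
    print(looks_degenerate([1.0] * 50))
    print(looks_degenerate([0.91, 0.45, 0.88]))
```

If this returns True for a standard test image, the output is saturated rather than merely misclassified, which is worth knowing before trying other fixes.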