Jittery zoom using OpenCV in Python


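I want to apply zoom-in and zoom-out effects to a video using OpenCV. Since OpenCV has no built-in zoom, I crop each frame to an interpolated width, height, x and y, and then resize the crop back to the original video size, i.e. 1920 x 1080.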

But when I render the final video, it is jittery. I am not sure why this happens. I want a perfectly smooth zoom in and out starting at a specific time.

I built an easing function that gives the interpolated zoom values for each frame:

import cv2


video_path = 'inputTest.mp4'
cap = cv2.VideoCapture(video_path)

fps = int(cap.get(cv2.CAP_PROP_FPS)) 
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter('output_video.mp4', fourcc, fps, (1920, 1080))

initialZoomValue={
     'initialZoomWidth': 1920,
    'initialZoomHeight': 1080,
    'initialZoomX': 0,
    'initialZoomY': 0
}

desiredValues = {
     'zoomWidth': 1672,
     'zoomHeight': 941,
     'zoomX': 200,
     'zoomY': 0
}

def ease_out_quart(t):
    return 1 - (1 - t) ** 4

def zoomInInterpolation(initialZoomValue, desiredZoom, start, end, index):
    t = (index - start) / (end - start)
    eased_t = ease_out_quart(t)

    interpolatedWidth = round(initialZoomValue['initialZoomWidth'] + eased_t * (desiredZoom['zoomWidth'] - initialZoomValue['initialZoomWidth']), 2)
    interpolatedHeight = round(initialZoomValue['initialZoomHeight'] + eased_t * (desiredZoom['zoomHeight'] - initialZoomValue['initialZoomHeight']), 2)
    interpolatedX = round(initialZoomValue['initialZoomX'] + eased_t * (desiredZoom['zoomX'] - initialZoomValue['initialZoomX']), 2)
    interpolatedY = round(initialZoomValue['initialZoomY'] + eased_t * (desiredZoom['zoomY'] - initialZoomValue['initialZoomY']), 2)
    
    return {'interpolatedWidth': int(interpolatedWidth), 'interpolatedHeight': int(interpolatedHeight), 'interpolatedX': int(interpolatedX), 'interpolatedY': int(interpolatedY)}

def generate_frame():
        while cap.isOpened():
            code, frame = cap.read()
            if code:
                yield frame
            else:
                print("bailing")
                break


for i, frame in enumerate(generate_frame()):
   if i >= 1 and i <= 60:
        interpolatedValues = zoomInInterpolation(initialZoomValue, desiredValues, 1, 60, i)
        crop = frame[interpolatedValues['interpolatedY']:(interpolatedValues['interpolatedHeight'] + interpolatedValues['interpolatedY']), interpolatedValues['interpolatedX']:(interpolatedValues['interpolatedWidth'] + interpolatedValues['interpolatedX'])]
        zoomedFrame = cv2.resize(crop,(1920, 1080), interpolation = cv2.INTER_CUBIC) 

        out.write(zoomedFrame)

# Release the video capture and the writer, then close any windows
cap.release()
out.release()
cv2.destroyAllWindows()

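But the final video I get is shaky:

Final video

I want the video to zoom in and out perfectly smoothly, without any jitter.

Here is the interpolation graph: GRAPH

If I don't round the numbers too early and only return integer values, this is the graph: GRAPH for Integer return

Since OpenCV only accepts integers for cropping, the interpolation function cannot return fractional values.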
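The effect of the integer conversion can be checked without any video. The following is a minimal sketch that reuses the easing function and the width values from the code above (1920 to 1672 over frames 1 to 60); the helper name `interpolated_width` and the `stalled` list are made up for this illustration. It compares the ideal float width with the integer width that is actually used for cropping.

```python
def ease_out_quart(t):
    return 1 - (1 - t) ** 4

def interpolated_width(index, start=1, end=60, w0=1920, w1=1672):
    """Ideal (float) crop width for a frame, using the values from the question."""
    t = (index - start) / (end - start)
    return w0 + ease_out_quart(t) * (w1 - w0)

# Frame numbers 1..60, as in the question's loop.
float_widths = [interpolated_width(i) for i in range(1, 61)]
int_widths = [int(w) for w in float_widths]

# Frame numbers whose integer width equals the previous frame's width:
# the crop did not change although the ideal width did.
stalled = [i + 1 for i in range(1, 60) if int_widths[i] == int_widths[i - 1]]

print("distinct integer widths:", len(set(int_widths)), "of", len(int_widths))
print("frames where the crop width stalls:", stalled)
```

Towards the end of the easing, the float width still changes on every frame while the integer width repeats, so the crop holds still for a while and then jumps by a whole pixel.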

python opencv image-processing video-processing
1 Answer

First, let's look at why your approach jitters. Then I'll show you an alternative that doesn't jitter.

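In your approach, you "zoom" by first cropping the image and then resizing it. That cropping only happens in whole pixel rows/columns, not in finer steps. You can see this near the end of the easing, where the zoom changes very slowly: the width/height of the crop changes by less than one pixel per frame, so the integer crop only changes every few frames.

Instead of cropping like this, calculate and apply a transformation matrix for each frame: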

import numpy as np
import cv2 as cv
from tqdm import tqdm  # remove that if you don't like it

# Those two functions generate simple translation and scaling matrices:

def translate2(tx=0, ty=0):
    T = np.eye(3)
    T[0:2, 2] = [tx, ty]
    return T

def scale2(s=1, sx=1, sy=1):
    T = np.diag([s*sx, s*sy, 1])
    return T

# you know this one already
def ease_out_quart(alpha):
    return 1 - (1 - alpha) ** 4

# some constants to describe the zoom
im = cv.imread(cv.samples.findFile("starry_night.jpg"))
(imheight, imwidth) = im.shape[:2]

(output_width, output_height) = (1280, 720)
fps = 60
duration = 5.0  # secs

# anchor in center of image
anchor = np.array([ (imwidth-1)/2, (imheight-1)/2 ])

# position in center of frame
zoom_center = np.array([ (output_width-1)/2, (output_height-1)/2 ])

zoom_t_start, zoom_t_end = 1.0, 4.0
zoom_z_start, zoom_z_end = 1.0, 10.0

# calculates the matrix:
def calculate_transform(timestamp):
    alpha = (timestamp - zoom_t_start) / (zoom_t_end - zoom_t_start)
    alpha = np.clip(alpha, 0, 1)
    alpha = ease_out_quart(alpha)
    z = zoom_z_start + alpha * (zoom_z_end - zoom_z_start)

    T = translate2(*-anchor)
    T = scale2(s=z) @ T
    T = translate2(*+zoom_center) @ T

    return T

# applies the matrix:
def animation_callback(timestamp, canvas):
    T = calculate_transform(timestamp)
    cv.warpPerspective(
        src=im,
        M=T,
        dsize=(output_width, output_height),
        dst=canvas,  # drawing over the same buffer repeatedly
        flags=cv.INTER_LANCZOS4,  # or INTER_LINEAR, INTER_NEAREST, ...
    )

# generate the video
writer = cv.VideoWriter(
    filename="output.avi",  # AVI container: OpenCV built-in
    fourcc=cv.VideoWriter_fourcc(*"MJPG"),  # MJPEG codec: OpenCV built-in
    fps=fps,
    frameSize=(output_width, output_height),
    isColor=True
)
assert writer.isOpened()

canvas = np.zeros((output_height, output_width, 3), dtype=np.uint8)

timestamps = np.arange(0, duration * fps) / fps

try:
    for timestamp in tqdm(timestamps):
        animation_callback(timestamp, canvas)
        writer.write(canvas)
        cv.imshow("frame", canvas)
        key = cv.waitKey(1)
        if key in (13, 27):
            break
finally:
    cv.destroyWindow("frame")
    writer.release()
    print("done")
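The same idea can be carried over to the crop rectangle from the question. The sketch below is not part of the original answer: `crop_to_affine` and `apply_affine` are hypothetical helpers that turn an unrounded (x, y, width, height) rectangle into a 2x3 affine matrix, so the interpolated values never need to be converted to integers. It assumes a 1920 x 1080 output and ignores the half-pixel offset between pixel centers and pixel edges, which should not matter for smoothness.

```python
def crop_to_affine(x, y, w, h, out_w=1920, out_h=1080):
    """2x3 affine matrix that maps the float crop rectangle (x, y, w, h)
    onto the full output frame. No rounding is applied anywhere."""
    sx = out_w / w
    sy = out_h / h
    return [
        [sx, 0.0, -x * sx],
        [0.0, sy, -y * sy],
    ]

def apply_affine(M, px, py):
    """Map one point through M (only used here to check the matrix)."""
    return (M[0][0] * px + M[0][1] * py + M[0][2],
            M[1][0] * px + M[1][1] * py + M[1][2])

# Example rectangle with fractional values, as an easing function would produce.
M = crop_to_affine(123.4, 0.0, 1767.1, 994.0)
top_left = apply_affine(M, 123.4, 0.0)
bottom_right = apply_affine(M, 123.4 + 1767.1, 994.0)
print(top_left, bottom_right)

# With NumPy and OpenCV available, the matrix could be applied per frame as:
#   zoomed = cv2.warpAffine(frame, np.float32(M), (1920, 1080),
#                           flags=cv2.INTER_LINEAR)
```

The corners of the crop land on the corners of the output frame, and because the matrix changes a little on every frame, the zoom no longer advances in whole-pixel steps.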

Here are the two resulting videos:

https://imgur.com/a/mDfrpre
