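I am currently porting my Android streaming application to Windows. To decode the h264 video stream I use FFmpeg, with hardware acceleration where possible. Over the past two weeks I have read a lot of documentation and studied many examples on the internet. For my project I use JavaCV, which uses FFmpeg 5.1.2 internally. On Windows I support D3D11VA, DXVA2 and Cuvid for hardware acceleration (with software decoding as a fallback). During testing I noticed strange artifacts when using D3D11VA or DXVA2 hardware acceleration. After further investigation I found that I was getting a lot of "Invalid data found when processing input" errors when calling avcodec_send_packet.
The error seems to occur only on certain key frames, and it is always reproducible. The software decoder and the cuvid decoder handle and decode such frames without any problem, so I am not sure why the frame would contain invalid data. I experimented a lot with the decoder configuration, but nothing helped, and at that point I concluded that this is definitely not normal behavior.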
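I have provided a reproducible example that can be downloaded here. All the important parts are in the App.java class. A code sample is also posted below. The example tries to decode a key frame. The key-frame data, including the SPS and PPS, is read from a file in the project's resources folder.
To run the project, execute .\gradlew build followed by .\gradlew run. If you run the example, the last log message shown in the terminal should be "SUCESS with HW decoding". The hardware decoder can be changed via the HW_DEVICE_TYPE variable in the App.java class. To disable hardware acceleration, set USE_HW_ACCEL to false.
To me everything looks correct, and I have no idea what is wrong with the code. I searched the internet extensively for the root cause, but I did not find a solution, only other sources that (possibly) relate to the same problem: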
https://www.mail-archive.com/[email protected]/...
https://stackoverflow.com/questions/67307397/ffmpeg-...
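I also found another streaming application for Windows that can use D3D11VA and DXVA2 hardware acceleration, called Chiaki (it requires a PS4 or PS5), and it appears to have exactly the same problem. I used the build provided here. It also fails to decode certain key frames when hardware acceleration with D3D11VA or DXVA2 is selected (e.g. the first key frame received from the stream). Chiaki can output the seemingly faulty frames, but my example can do that too by setting USE_AV_EF_EXPLODE to false.
Are there any FFmpeg experts around who can check what is going wrong with D3D11VA or DXVA2? Is there anything else that needs to be done to make the D3D11VA and DXVA2 hardware decoders work? I am completely out of ideas now, and I am not even sure whether this is solvable.
My test machine runs Windows 11 with the latest NVIDIA drivers installed.
Edit: here is a reduced but complete example of my project (the key-frame file including SPS and PPS can be downloaded from here. It is a hex-string file and can be decoded with the provided HexUtil class).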
import javafx.application.Application;
import javafx.scene.Scene;
import javafx.scene.layout.Pane;
import javafx.stage.Stage;
import org.bytedeco.ffmpeg.avcodec.AVCodec;
import org.bytedeco.ffmpeg.avcodec.AVCodecContext;
import org.bytedeco.ffmpeg.avcodec.AVCodecHWConfig;
import org.bytedeco.ffmpeg.avcodec.AVPacket;
import org.bytedeco.ffmpeg.avutil.AVBufferRef;
import org.bytedeco.ffmpeg.avutil.AVDictionary;
import org.bytedeco.ffmpeg.avutil.AVFrame;
import org.bytedeco.javacpp.BytePointer;
import org.bytedeco.javacpp.IntPointer;
import org.bytedeco.javacv.FFmpegLogCallback;
import org.tinylog.Logger;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Objects;
import java.util.function.Consumer;
import static org.bytedeco.ffmpeg.avcodec.AVCodecContext.AV_EF_EXPLODE;
import static org.bytedeco.ffmpeg.avcodec.AVCodecContext.FF_THREAD_SLICE;
import static org.bytedeco.ffmpeg.global.avcodec.*;
import static org.bytedeco.ffmpeg.global.avutil.*;
public class App extends Application {
/**** decoder variables ****/
private AVHWContextInfo hardwareContext;
private AVCodec decoder;
private AVCodecContext m_VideoDecoderCtx;
private AVCodecContext.Get_format_AVCodecContext_IntPointer formatCallback;
private final int streamResolutionX = 1920;
private final int streamResolutionY = 1080;
// AV_HWDEVICE_TYPE_CUDA // example works with cuda
// AV_HWDEVICE_TYPE_DXVA2 // producing Invalid data found on keyframe
// AV_HWDEVICE_TYPE_D3D11VA // producing Invalid data found on keyframe
private static final int HW_DEVICE_TYPE = AV_HWDEVICE_TYPE_DXVA2;
private static final boolean USE_HW_ACCEL = true;
private static final boolean USE_AV_EF_EXPLODE = true;
public static void main(final String[] args) {
//System.setProperty("prism.order", "d3d,sw");
System.setProperty("prism.vsync", "false");
Application.launch(App.class);
}
@Override
public void start(final Stage primaryStage) {
final Pane dummyPane = new Pane();
dummyPane.setStyle("-fx-background-color: black");
final Scene scene = new Scene(dummyPane, this.streamResolutionX, this.streamResolutionY);
primaryStage.setScene(scene);
primaryStage.show();
primaryStage.setMinWidth(480);
primaryStage.setMinHeight(360);
this.initializeFFmpeg(result -> {
if (!result) {
Logger.error("FFmpeg could not be initialized correctly, terminating program");
System.exit(1);
return;
}
this.performTestFramesFeeding();
});
}
private void initializeFFmpeg(final Consumer<Boolean> finishHandler) {
FFmpegLogCallback.setLevel(AV_LOG_DEBUG); // Increase log level until the first frame is decoded
//FFmpegLogCallback.setLevel(AV_LOG_QUIET);
this.decoder = avcodec_find_decoder(AV_CODEC_ID_H264); // usually decoder name is h264 and without hardware support it's yuv420p otherwise nv12
if (this.decoder == null) {
Logger.error("Unable to find decoder for format {}", "h264");
finishHandler.accept(false);
return;
}
Logger.info("Current decoder name: {}, {}", this.decoder.name().getString(), this.decoder.long_name().getString());
if (true) {
for (; ; ) {
this.m_VideoDecoderCtx = avcodec_alloc_context3(this.decoder);
if (this.m_VideoDecoderCtx == null) {
// Allocation failed; retrying in the endless loop would never terminate, so give up
Logger.error("Unable to allocate decoder context for AV_CODEC_ID_H264");
finishHandler.accept(false);
return;
}
if (App.USE_HW_ACCEL) {
this.hardwareContext = this.createHardwareContext();
if (this.hardwareContext != null) {
Logger.info("Set hwaccel support");
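// Hand the codec its own reference so the codec context and AVHWContextInfo can each release theirs
this.m_VideoDecoderCtx.hw_device_ctx(av_buffer_ref(this.hardwareContext.hwContext())); // comment to disable hwaccel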
}
} else {
Logger.info("Hwaccel manually disabled");
}
// Always request low delay decoding
this.m_VideoDecoderCtx.flags(this.m_VideoDecoderCtx.flags() | AV_CODEC_FLAG_LOW_DELAY);
// Allow display of corrupt frames and frames missing references
this.m_VideoDecoderCtx.flags(this.m_VideoDecoderCtx.flags() | AV_CODEC_FLAG_OUTPUT_CORRUPT);
this.m_VideoDecoderCtx.flags2(this.m_VideoDecoderCtx.flags2() | AV_CODEC_FLAG2_SHOW_ALL);
if (App.USE_AV_EF_EXPLODE) {
// Report decoding errors to allow us to request a key frame
this.m_VideoDecoderCtx.err_recognition(this.m_VideoDecoderCtx.err_recognition() | AV_EF_EXPLODE);
}
// Enable slice multi-threading for software decoding
if (this.m_VideoDecoderCtx.hw_device_ctx() == null) { // if not hw accelerated
this.m_VideoDecoderCtx.thread_type(this.m_VideoDecoderCtx.thread_type() | FF_THREAD_SLICE);
this.m_VideoDecoderCtx.thread_count(2/*AppUtil.getCpuCount()*/);
} else {
// No threading for HW decode
this.m_VideoDecoderCtx.thread_count(1);
}
this.m_VideoDecoderCtx.width(this.streamResolutionX);
this.m_VideoDecoderCtx.height(this.streamResolutionY);
this.m_VideoDecoderCtx.pix_fmt(this.getDefaultPixelFormat());
this.formatCallback = new AVCodecContext.Get_format_AVCodecContext_IntPointer() {
@Override
public int call(final AVCodecContext context, final IntPointer pixelFormats) {
final boolean hwDecodingSupported = context.hw_device_ctx() != null && App.this.hardwareContext != null;
final int preferredPixelFormat = hwDecodingSupported ?
App.this.hardwareContext.hwConfig().pix_fmt() :
context.pix_fmt();
int i = 0;
while (true) {
final int currentSupportedFormat = pixelFormats.get(i++);
System.out.println("Supported pixel formats " + currentSupportedFormat);
if (currentSupportedFormat == preferredPixelFormat) {
Logger.info("[FFmpeg]: pixel format in format callback is {}", currentSupportedFormat);
return currentSupportedFormat;
}
if (currentSupportedFormat == AV_PIX_FMT_NONE) {
break;
}
}
i = 0;
while (true) { // try again and search for yuv
final int currentSupportedFormat = pixelFormats.get(i++);
if (currentSupportedFormat == AV_PIX_FMT_YUV420P) {
Logger.info("[FFmpeg]: Not found in first match so use {}", AV_PIX_FMT_YUV420P);
return currentSupportedFormat;
}
if (currentSupportedFormat == AV_PIX_FMT_NONE) {
break;
}
}
i = 0;
while (true) { // try again and search for nv12
final int currentSupportedFormat = pixelFormats.get(i++);
if (currentSupportedFormat == AV_PIX_FMT_NV12) {
Logger.info("[FFmpeg]: Not found in second match so use {}", AV_PIX_FMT_NV12);
return currentSupportedFormat;
}
if (currentSupportedFormat == AV_PIX_FMT_NONE) {
break;
}
}
Logger.info("[FFmpeg]: pixel format in format callback is using fallback {}", AV_PIX_FMT_NONE);
return AV_PIX_FMT_NONE;
}
};
this.m_VideoDecoderCtx.get_format(this.formatCallback);
final AVDictionary options = new AVDictionary(null);
final int result = avcodec_open2(this.m_VideoDecoderCtx, this.decoder, options);
if (result < 0) {
Logger.error("avcodec_open2 was not successful");
finishHandler.accept(false);
return;
}
av_dict_free(options);
break;
}
}
if (this.decoder == null || this.m_VideoDecoderCtx == null) {
finishHandler.accept(false);
return;
}
finishHandler.accept(true);
}
private AVHWContextInfo createHardwareContext() {
AVHWContextInfo result = null;
for (int i = 0; ; i++) {
final AVCodecHWConfig config = avcodec_get_hw_config(this.decoder, i);
if (config == null) {
break;
}
if ((config.methods() & AV_CODEC_HW_CONFIG_METHOD_HW_DEVICE_CTX) == 0) { // skip configs that do not support hw_device_ctx
continue;
}
final int device_type = config.device_type();
if (device_type != App.HW_DEVICE_TYPE) {
continue;
}
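// av_hwdevice_ctx_create allocates the device context itself, so no prior av_hwdevice_ctx_alloc is needed
final AVBufferRef hw_context = new AVBufferRef(null);
if (av_hwdevice_ctx_create(hw_context, device_type, (String) null, null, 0) < 0) {
Logger.error("HW accel not supported for type {}", device_type);
// Nothing to free here: config is static data owned by FFmpeg, and a failed create leaves no context behind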
} else {
Logger.info("HW accel created for type {}", device_type);
result = new AVHWContextInfo(config, hw_context);
}
break;
}
return result;
}
@Override
public void stop() {
this.releaseNativeResources();
}
/************************/
/*** video processing ***/
/************************/
private void performTestFramesFeeding() {
final AVPacket pkt = av_packet_alloc();
if (pkt == null) {
return;
}
try (final BytePointer bp = new BytePointer(65_535 * 4)) {
final byte[] frameData = AVTestFrames.h264KeyTestFrame;
bp.position(0);
bp.put(frameData);
bp.limit(frameData.length);
pkt.data(bp);
pkt.capacity(bp.capacity());
pkt.size(frameData.length);
pkt.position(0);
pkt.limit(frameData.length);
final AVFrame avFrame = av_frame_alloc();
final int err = avcodec_send_packet(this.m_VideoDecoderCtx, pkt); // this will fail with D3D11VA and DXVA2
if (err < 0) {
final BytePointer buffer = new BytePointer(512);
av_strerror(err, buffer, buffer.capacity());
final String string = buffer.getString();
System.out.println("Error on decoding test frame " + err + " message " + string);
av_frame_free(avFrame);
return;
}
final int result = avcodec_receive_frame(this.m_VideoDecoderCtx, avFrame);
if (result == 0) {
if (this.m_VideoDecoderCtx.hw_device_ctx() == null) {
System.out.println("SUCESS with SW decoding");
} else {
// Copy the decoded surface from GPU memory into a system-memory frame
final AVFrame swFrame = av_frame_alloc();
if (av_hwframe_transfer_data(swFrame, avFrame, 0) < 0) {
System.out.println("Failed to transfer frame from hardware");
} else {
System.out.println("SUCESS with HW decoding");
}
av_frame_free(swFrame);
}
} else {
final BytePointer buffer = new BytePointer(512);
av_strerror(result, buffer, buffer.capacity());
final String string = buffer.getString();
System.out.println("error " + result + " message " + string);
}
// Free the frame on every path, not only on errors
av_frame_free(avFrame);
} finally {
if (pkt.stream_index() != -1) {
av_packet_unref(pkt);
}
pkt.releaseReference();
}
}
final Object releaseLock = new Object();
private volatile boolean released = false;
private void releaseNativeResources() {
if (this.released) {
return;
}
this.released = true;
synchronized (this.releaseLock) {
// Close the video codec
if (this.m_VideoDecoderCtx != null) {
avcodec_free_context(this.m_VideoDecoderCtx);
this.m_VideoDecoderCtx = null;
}
// close the format callback
if (this.formatCallback != null) {
this.formatCallback.close();
this.formatCallback = null;
}
// close hw context
if (this.hardwareContext != null) {
this.hardwareContext.free();
}
}
}
private int getDefaultPixelFormat() {
return AV_PIX_FMT_YUV420P; // Always return yuv420p here
}
public static final class HexUtil {
private static final char[] hexArray = "0123456789ABCDEF".toCharArray();
private HexUtil() {
}
public static String hexlify(final byte[] bytes) {
final char[] hexChars = new char[bytes.length * 2];
for (int j = 0; j < bytes.length; ++j) {
final int v = bytes[j] & 255;
hexChars[j * 2] = HexUtil.hexArray[v >>> 4];
hexChars[j * 2 + 1] = HexUtil.hexArray[v & 15];
}
return new String(hexChars);
}
public static byte[] unhexlify(final String argbuf) {
final int arglen = argbuf.length();
if (arglen % 2 != 0) {
throw new RuntimeException("Odd-length string");
} else {
final byte[] retbuf = new byte[arglen / 2];
for (int i = 0; i < arglen; i += 2) {
final int top = Character.digit(argbuf.charAt(i), 16);
final int bot = Character.digit(argbuf.charAt(i + 1), 16);
if (top == -1 || bot == -1) {
throw new RuntimeException("Non-hexadecimal digit found");
}
retbuf[i / 2] = (byte) ((top << 4) + bot);
}
return retbuf;
}
}
}
public static final class AVHWContextInfo {
private final AVCodecHWConfig hwConfig;
private final AVBufferRef hwContext;
private volatile boolean freed = false;
public AVHWContextInfo(final AVCodecHWConfig hwConfig, final AVBufferRef hwContext) {
this.hwConfig = hwConfig;
this.hwContext = hwContext;
}
public AVCodecHWConfig hwConfig() {
return this.hwConfig;
}
public AVBufferRef hwContext() {
return this.hwContext;
}
public void free() {
if (this.freed) {
return;
}
this.freed = true;
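// hwConfig points to static data owned by FFmpeg and must not be freed
av_buffer_unref(this.hwContext); // the device context is reference counted, so unref instead of av_free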
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
AVHWContextInfo that = (AVHWContextInfo) o;
return freed == that.freed && Objects.equals(hwConfig, that.hwConfig) && Objects.equals(hwContext, that.hwContext);
}
@Override
public int hashCode() {
return Objects.hash(hwConfig, hwContext, freed);
}
@Override
public String toString() {
return "AVHWContextInfo[" +
"hwConfig=" + this.hwConfig + ", " +
"hwContext=" + this.hwContext + ']';
}
}
public static final class AVTestFrames {
private AVTestFrames() {
}
static {
InputStream inputStream = null;
try {
inputStream = AVTestFrames.class.getClassLoader().getResourceAsStream("h264_test_key_frame.txt");
final byte[] h264TestFrameBuffer = inputStream == null ? new byte[0] : inputStream.readAllBytes();
final String h264TestFrame = new String(h264TestFrameBuffer, StandardCharsets.UTF_8);
AVTestFrames.h264KeyTestFrame = HexUtil.unhexlify(h264TestFrame);
} catch (final IOException e) {
Logger.error(e, "Could not parse test frame");
} finally {
if (inputStream != null) {
try {
inputStream.close();
inputStream = null;
} catch (final IOException e) {
Logger.error(e, "Could not close test frame input stream");
}
}
}
}
public static byte[] h264KeyTestFrame;
}
}
The project's build.gradle looks like this:
plugins {
id 'application'
id 'org.openjfx.javafxplugin' version '0.0.13'
}
group 'com.test.example'
version '1.0.0'
repositories {
mavenCentral()
mavenLocal()
maven { url 'https://jitpack.io' }
}
dependencies {
implementation group: 'org.bytedeco', name: 'javacv-platform', version: '1.5.8'
implementation group: 'com.github.oshi', name: 'oshi-core', version: '3.4.3'
implementation 'org.tinylog:tinylog-api:2.1.0'
implementation 'org.tinylog:tinylog-impl:2.1.0'
implementation 'org.jcodec:jcodec:0.2.5'
}
test {
useJUnitPlatform()
}
javafx {
version = '17.0.6'
modules = ['javafx.graphics', 'javafx.controls', 'javafx.fxml', 'javafx.base']
}
mainClassName = 'com.test.example.App'
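After countless hours of debugging, searching the web and reading the FFmpeg source code, I finally found the problem. It really is a limitation of the current FFmpeg source code (still present as of today's version 6.0.0).
I use FFmpeg through JavaCV, where I merged an FFmpeg patch that fixes the issue. In h264dec.h there is a constant called MAX_SLICES, which is set to 32. This MAX_SLICES value is used in dxva2_h264.c. I also found this interesting bug report on GitHub, which seems to be related to the issue: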
https://github.com/wang-bin/QtAV/issues/923
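My pull request for JavaCV can be found here. So if anyone runs into the same problem with D3D11VA or DXVA2 hardware-accelerated decoding, check how many slices the frame you are trying to decode has: if it has more than 32 slices and you are using an unmodified version of FFmpeg, decoding will fail. I do not know why the number of supported slices is set so low in the code. Software decoding and the NVIDIA cuvid decoder are not affected by this limitation.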
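As a rough way to check a frame against this limit before handing it to the decoder, here is a minimal sketch that counts the slices in one frame. It assumes the frame is in Annex B byte-stream format (NAL units separated by 00 00 01 or 00 00 00 01 start codes) and that only NAL unit types 1 and 5 (coded slices) matter. The names SliceCounter, countSlices and exceedsDxvaLimit are made up for this illustration and are not part of the project or of FFmpeg.

```java
final class SliceCounter {
    // Mirrors the MAX_SLICES value from FFmpeg's h264dec.h mentioned above
    static final int MAX_SLICES = 32;

    private SliceCounter() {
    }

    /** Counts coded-slice NAL units (types 1 and 5) in an Annex B buffer. */
    public static int countSlices(final byte[] data) {
        int slices = 0;
        // Look for the three-byte start code 00 00 01. A four-byte start code
        // 00 00 00 01 ends with the same three bytes, so it is found as well.
        for (int i = 0; i + 3 < data.length; i++) {
            if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
                // The low five bits of the first NAL header byte are the NAL unit type
                final int nalType = data[i + 3] & 0x1F;
                if (nalType == 1 || nalType == 5) {
                    slices++;
                }
                i += 2; // skip the remaining start-code bytes
            }
        }
        return slices;
    }

    /** True if the frame has more slices than an unpatched D3D11VA/DXVA2 path accepts. */
    public static boolean exceedsDxvaLimit(final byte[] data) {
        return countSlices(data) > MAX_SLICES;
    }

    public static void main(final String[] args) {
        // Tiny hand-made buffer: SPS (type 7), PPS (type 8) and two IDR slices (type 5)
        final byte[] sample = {
            0, 0, 0, 1, 0x67, 0x42,
            0, 0, 0, 1, 0x68, (byte) 0xCE,
            0, 0, 1, 0x65, (byte) 0x88,
            0, 0, 1, 0x65, (byte) 0x88
        };
        System.out.println("slices = " + countSlices(sample));
        System.out.println("exceeds limit = " + exceedsDxvaLimit(sample));
    }
}
```

In the example project, this could be called with AVTestFrames.h264KeyTestFrame before avcodec_send_packet to see whether the failing key frame is above the limit. Streams that use a length-prefixed (AVCC) layout instead of start codes would need a different parser.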