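I'm trying to play a USB webcam stream (not sure what format it is) with ffplay on Windows. I can watch the video without any problems, but I keep getting the following error in the console.
ffplay.exe -f dshow -i video="Logitech HD Webcam C615" -loglevel debug
[mjpeg @97a118cc80] unable to decode APP fields: Invalid data found when processing input
Do I really need to worry about this error? Or is there a filter I need to add to the command to resolve it?
Note: I tried saving the stream to a file with ffmpeg and ran into the same issue.
Thanks in advance.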
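Those APP field messages are not errors. What you are seeing is Logitech's proprietary Motion-JPEG format, which they use in many of their webcams. I have seen it in the C270 and the newer C922, for example. An mjpeg stream contains a series of jpeg images; some are keyframes (complete images), and others are frames such as delta frames that describe the differences between frames. What Logitech does is attach H264 data to the jpeg frames as APP attachments, thereby embedding an H264 stream in the mjpeg stream, i.e. it is a stream within a stream. When you play or transcode the data in the mjpeg stream, ffmpeg encounters these APP attachments and does not know what to do with them. I believe programs like Skype are able to read both the outer mjpeg stream and the inner H264 stream.
If you want to see this for yourself, you can encode a small video from the camera's mjpeg stream, extract the jpeg images, and then look at the structure of the jpeg images; you will see the embedded video.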
# create a small mp4, copying mjpeg stream off the cam for a second or two
$ ffmpeg -f v4l2 -input_format mjpeg -i /dev/video0 -c:v copy test.mp4
# extract the unaltered jpeg files inside the stream
$ ffmpeg -i test.mp4 -vcodec copy %03d.jpg
# view any of the jpeg files for APP attachments
$ exiv2 -pS 001.jpg
STRUCTURE OF JPEG FILE: 001.jpg
address | marker | length | data
0 | 0xffd8 SOI
2 | 0xffe0 APP0 | 33 | AVI1.....x.x....................
37 | 0xffdb DQT | 67
106 | 0xffdb DQT | 67
175 | 0xffdd DRI | 4
181 | 0xffe0 APP0 | 4 | ....
187 | 0xffc0 SOF0 | 17
206 | 0xffda SOS
See those APP0 attachments on the jpeg? That is the embedded H264 data the decoder/player is complaining about.
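If exiv2 is not at hand, the same kind of listing can be produced with a few lines of Python. This is only a minimal sketch: the function name list_segments and the marker table are my own, and it assumes every marker before SOS carries a 16-bit big-endian length field (true for the dumps shown here, not for every possible JPEG).

```python
import struct

# Names for the markers that appear in the dumps in this thread.
MARKER_NAMES = {0xD8: "SOI", 0xE0: "APP0", 0xDB: "DQT", 0xC4: "DHT",
                0xDD: "DRI", 0xC0: "SOF0", 0xDA: "SOS", 0xFE: "COM"}


def list_segments(data: bytes):
    """Return (offset, marker, length) tuples up to and including SOS.

    length is None for SOI and SOS, mirroring the exiv2 -pS output.
    Assumes each marker before SOS is followed by a length field.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    segments = [(0, 0xFFD8, None)]
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:  # entropy-coded data starts here, stop walking
            segments.append((pos, 0xFFDA, None))
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((pos, 0xFF00 | marker, length))
        pos += 2 + length  # 2 marker bytes + length (which counts itself)
    return segments


def format_segments(segments):
    """Render the tuples in a layout similar to exiv2 -pS."""
    lines = []
    for offset, marker, length in segments:
        name = MARKER_NAMES.get(marker & 0xFF, "?")
        size = "" if length is None else f" | {length}"
        lines.append(f"{offset} | {marker:#06x} {name}{size}")
    return "\n".join(lines)
```

Reading one of the extracted files as bytes and passing it to list_segments, then printing format_segments of the result, should give offsets and lengths comparable to the exiv2 output above.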
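Phil states that these are embedded H.264 frames, but I don't know how he inferred that. I extracted the APP0 segment and tried to parse it as raw H.264, and it did not decode.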
$ exiv2 -pS Logitech-C270-003.jpg
STRUCTURE OF JPEG FILE: Logitech-C270-003.jpg
address | marker | length | data
0 | 0xffd8 SOI
2 | 0xffe0 APP0 | 33 | AVI1.....x.x..................
37 | 0xffdb DQT | 67
106 | 0xffdb DQT | 67
175 | 0xffdd DRI | 4
181 | 0xffe0 APP0 | 4 | .
187 | 0xffc0 SOF0 | 17
206 | 0xffda SOS
$ dd if=Logitech-C270-003.jpg bs=1 skip=6 count=31 of=Logitech-C270-003.h264
33+0 records in
33+0 records out
33 bytes copied, 0.000154041 s, 214 kB/s
$ ffplay -f h264 -i Logitech-C270-003.h264
[h264 @ 0x7f1794009d00] missing picture in access unit with size 31
[extract_extradata @ 0x7f1794021a40] No start code is found.
Logitech-C270-003.h264: could not find codec parameters
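Another anomaly I noticed is that every MJPEG frame contains an APP0 segment of length 33 (the same as his), which I find inconsistent with his assertion that the stream consists of keyframes and delta frames.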
$ exiv2 -pS Logitech-C270-001.jpg
STRUCTURE OF JPEG FILE: Logitech-C270-001.jpg
address | marker | length | data
0 | 0xffd8 SOI
2 | 0xffe0 APP0 | 33 | AVI1.....x.x..................
37 | 0xffdb DQT | 67
106 | 0xffdb DQT | 67
175 | 0xffdd DRI | 4
181 | 0xffe0 APP0 | 4 | .
187 | 0xffc0 SOF0 | 17
206 | 0xffda SOS
$ exiv2 -pS Logitech-C270-002.jpg
STRUCTURE OF JPEG FILE: Logitech-C270-002.jpg
address | marker | length | data
0 | 0xffd8 SOI
2 | 0xffe0 APP0 | 33 | AVI1.....x.x..................
37 | 0xffdb DQT | 67
106 | 0xffdb DQT | 67
175 | 0xffdd DRI | 4
181 | 0xffe0 APP0 | 4 | .
187 | 0xffc0 SOF0 | 17
206 | 0xffda SOS
$ exiv2 -pS Logitech-C270-003.jpg
STRUCTURE OF JPEG FILE: Logitech-C270-003.jpg
address | marker | length | data
0 | 0xffd8 SOI
2 | 0xffe0 APP0 | 33 | AVI1.....x.x..................
37 | 0xffdb DQT | 67
106 | 0xffdb DQT | 67
175 | 0xffdd DRI | 4
181 | 0xffe0 APP0 | 4 | .
187 | 0xffc0 SOF0 | 17
206 | 0xffda SOS
...
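If we assumed that were true, every keyframe would have a delta frame associated with it, which makes no sense. Encoding H.264 and MJPEG at the same time makes no sense either.
Looking more closely at the APP0 segment, we find that it is "almost" a compliant JFIF header.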
$ xxd Logitech-C270-003.h264
00000000: ffe0 0021 4156 4931 0001 0101 0078 0078 ...!AVI1.....x.x
00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000020: 0000 00 ...
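Let's break it down.
FF E0 is the APP0 marker.
00 21 is the size of the segment, including the size field itself (16-bit big-endian): 33 in decimal.
The only notable anomaly is that AVI1 (plus a null byte) appears in place of JFIF (plus a null byte); this is probably a FourCC. A quick check of the FFmpeg source code confirms this. I don't know why it is there; my best guess is that it marks the file as MJPEG.
The next two bytes, 01 01, are the major and minor version respectively, which translates to JFIF version 1.01.
The next byte, 01, is the density unit (dots per inch).
The next two bytes, 00 78 (repeated once more), mean 120 DPI.
The next two bytes, 00 00, are the thumbnail width and height respectively, indicating that there is no thumbnail.
The rest is presumably null-byte padding. The APP0 marker is semi-compliant. For reference, here is the APP0 marker of a compliant JFIF file encoded by FFmpeg.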
$ ffmpeg -i Logitech-C270-003.jpg -bsf:v mjpeg2jpeg Logitech-C270-003-duplicate.jpg
...
[mjpeg @ 0x5d3e1cea4680] unable to decode APP fields: Invalid data found when processing input
Input #0, image2, from 'Logitech-C270-003.jpg':
Duration: 00:00:00.04, start: 0.000000, bitrate: 9521 kb/s
Stream #0:0: Video: mjpeg (Baseline), yuvj422p(pc, bt470bg/unknown/unknown), 1280x720, 25 fps, 25 tbr, 25 tbn
Stream mapping:
Stream #0:0 -> #0:0 (mjpeg (native) -> mjpeg (native))
Press [q] to stop, [?] for help
[mjpeg @ 0x5d3e1ceab600] unable to decode APP fields: Invalid data found when processing input
Output #0, image2, to 'Logitech-C270-003-duplicate.jpg':
Metadata:
encoder : Lavf60.16.100
Stream #0:0: Video: mjpeg, yuvj422p(pc, bt470bg/unknown/unknown, progressive), 1280x720, q=2-31, 200 kb/s, 25 fps, 25 tbn
Metadata:
encoder : Lavc60.31.102 mjpeg
Side data:
cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: N/A
[image2 @ 0x5d3e1ceab9c0] The specified filename 'Logitech-C270-003-duplicate.jpg' does not contain an image sequence pattern or a pattern is invalid.
[image2 @ 0x5d3e1ceab9c0] Use a pattern such as %03d for an image sequence or use the -update option (with -frames:v 1 if needed) to write a single image.
[out#0/image2 @ 0x5d3e1cea64c0] video:40kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
frame= 1 fps=0.0 q=5.8 Lsize=N/A time=00:00:00.00 bitrate=N/A speed= 0x
$ exiv2 -pS Logitech-C270-003-duplicate.jpg
STRUCTURE OF JPEG FILE: Logitech-C270-003-duplicate.jpg
address | marker | length | data
0 | 0xffd8 SOI
2 | 0xffe0 APP0 | 16 | JFIF.........
20 | 0xffc4 DHT | 418
440 | 0xfffe COM | 16 | Lavc60.31.102
458 | 0xffdb DQT | 67
527 | 0xffc4 DHT | 159
688 | 0xffc0 SOF0 | 17
707 | 0xffda SOS
$ xxd -s 2 -l 18 Logitech-C270-003-duplicate.jpg
00000002: ffe0 0010 4a46 4946 0001 0100 0000 0000 ....JFIF........
00000012: 0000 ..
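The two headers can be compared field by field with a short script. This is a sketch under my own assumptions: parse_app0 is a made-up helper that reads any APP0 segment using the JFIF field layout, which is how I interpreted the Logitech bytes above; it does not claim that AVI1 segments are actually defined that way. The byte strings are copied from the two xxd dumps.

```python
import struct

# Bytes copied from the xxd dumps above (Logitech frame and FFmpeg output).
LOGITECH_APP0 = bytes.fromhex("ffe00021415649310001010100780078") + bytes(19)
FFMPEG_APP0 = bytes.fromhex("ffe000104a46494600010100000000000000")


def parse_app0(segment: bytes) -> dict:
    """Decode an APP0 segment as if it followed the JFIF field layout."""
    marker, length = struct.unpack(">HH", segment[:4])
    if marker != 0xFFE0:
        raise ValueError("not an APP0 segment")
    identifier = segment[4:9]  # 4 characters plus a null byte
    # version major/minor, density unit, X/Y density, thumbnail width/height
    major, minor, units, xdens, ydens, tw, th = struct.unpack(
        ">BBBHHBB", segment[9:18])
    return {
        "length": length,
        "identifier": identifier,
        "version": (major, minor),
        "units": units,
        "density": (xdens, ydens),
        "thumbnail": (tw, th),
    }


if __name__ == "__main__":
    for name, seg in (("Logitech", LOGITECH_APP0), ("FFmpeg", FFMPEG_APP0)):
        print(name, parse_app0(seg))
```

Apart from the identifier and the length (33 versus 16), the only fields that differ in these two dumps are the density unit and the density values.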
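Interestingly, FFmpeg has no problem with the first APP0 segment, which had seemed to be the cause of the error. The culprit is actually the second segment, as running FFmpeg with -loglevel debug shows.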
[AVFormatContext @ 0x5c53e38f3480] Opening 'Logitech-C270-003.jpg' for reading
[file @ 0x5c53e38f3b40] Setting default whitelist 'file,crypto,data'
[image2 @ 0x5c53e38f3480] Format image2 probed with size=2048 and score=50
[image2 @ 0x5c53e38f3480] Before avformat_find_stream_info() pos: 0 bytes read:32768 seeks:0 nb_streams:1
[mjpeg @ 0x5c53e38f4680] marker=d8 avail_size_in_buf=47603
[mjpeg @ 0x5c53e38f4680] marker parser used 0 bytes (0 bits)
[mjpeg @ 0x5c53e38f4680] marker=e0 avail_size_in_buf=47601
[mjpeg @ 0x5c53e38f4680] polarity 0
[mjpeg @ 0x5c53e38f4680] marker parser used 32 bytes (256 bits)
[mjpeg @ 0x5c53e38f4680] marker=db avail_size_in_buf=47566
[mjpeg @ 0x5c53e38f4680] index=0
[mjpeg @ 0x5c53e38f4680] qscale[0]: 6
[mjpeg @ 0x5c53e38f4680] marker parser used 67 bytes (536 bits)
[mjpeg @ 0x5c53e38f4680] marker=db avail_size_in_buf=47497
[mjpeg @ 0x5c53e38f4680] index=1
[mjpeg @ 0x5c53e38f4680] qscale[1]: 13
[mjpeg @ 0x5c53e38f4680] marker parser used 67 bytes (536 bits)
[mjpeg @ 0x5c53e38f4680] marker=dd avail_size_in_buf=47428
[mjpeg @ 0x5c53e38f4680] marker parser used 0 bytes (0 bits)
[mjpeg @ 0x5c53e38f4680] marker=e0 avail_size_in_buf=47422
[mjpeg @ 0x5c53e38f4680] unable to decode APP fields: Invalid data found when processing input
[mjpeg @ 0x5c53e38f4680] marker parser used 2 bytes (16 bits)
[mjpeg @ 0x5c53e38f4680] marker=c0 avail_size_in_buf=47416
[mjpeg @ 0x5c53e38f4680] Changing bps from 0 to 8
[mjpeg @ 0x5c53e38f4680] sof0: picture: 1280x720
[mjpeg @ 0x5c53e38f4680] component 0 2:1 id: 1 quant:0
[mjpeg @ 0x5c53e38f4680] component 1 1:1 id: 2 quant:1
[mjpeg @ 0x5c53e38f4680] component 2 1:1 id: 3 quant:1
[mjpeg @ 0x5c53e38f4680] pix fmt id 21111100
[mjpeg @ 0x5c53e38f4680] Format yuvj422p chosen by get_format().
[mjpeg @ 0x5c53e38f4680] marker parser used 17 bytes (136 bits)
[mjpeg @ 0x5c53e38f4680] escaping removed 772 bytes
[mjpeg @ 0x5c53e38f4680] marker=da avail_size_in_buf=47397
[mjpeg @ 0x5c53e38f4680] marker parser used 46625 bytes (373000 bits)
[mjpeg @ 0x5c53e38f4680] marker=d3 avail_size_in_buf=754
[mjpeg @ 0x5c53e38f4680] restart marker: 3
[mjpeg @ 0x5c53e38f4680] marker parser used 0 bytes (0 bits)
[mjpeg @ 0x5c53e38f4680] marker=d4 avail_size_in_buf=714
[mjpeg @ 0x5c53e38f4680] restart marker: 4
[mjpeg @ 0x5c53e38f4680] marker parser used 0 bytes (0 bits)
[mjpeg @ 0x5c53e38f4680] marker=d5 avail_size_in_buf=673
[mjpeg @ 0x5c53e38f4680] restart marker: 5
[mjpeg @ 0x5c53e38f4680] marker parser used 0 bytes (0 bits)
[mjpeg @ 0x5c53e38f4680] marker=d6 avail_size_in_buf=630
[mjpeg @ 0x5c53e38f4680] restart marker: 6
[mjpeg @ 0x5c53e38f4680] marker parser used 0 bytes (0 bits)
[mjpeg @ 0x5c53e38f4680] marker=d9 avail_size_in_buf=577
[mjpeg @ 0x5c53e38f4680] decode frame unused 577 bytes
[image2 @ 0x5c53e38f3480] After avformat_find_stream_info() pos: 47605 bytes read:47605 seeks:0 frames:1
Note these lines near the beginning.
[mjpeg @ 0x5c53e38f4680] marker=e0 avail_size_in_buf=47601
[mjpeg @ 0x5c53e38f4680] polarity 0
[mjpeg @ 0x5c53e38f4680] marker parser used 32 bytes (256 bits)
However, it fails at the second marker (the APP0 extension marker).
[mjpeg @ 0x5c53e38f4680] marker=e0 avail_size_in_buf=47422
[mjpeg @ 0x5c53e38f4680] unable to decode APP fields: Invalid data found when processing input
[mjpeg @ 0x5c53e38f4680] marker parser used 2 bytes (16 bits)
It is not compliant at all: it does not immediately follow the first APP0, and it is missing the required fields.
$ xxd -s 181 -l 6 Logitech-C270-003.jpg
000000b5: ffe0 0004 0000 ......
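To make the "missing required fields" point concrete: a length of 4 leaves only two payload bytes, which is too short to hold even a four-character identifier. The sketch below is an illustration of that check only; describe_app0 and the identifier table are hypothetical names of mine, and this is not a reproduction of FFmpeg's actual parsing logic.

```python
import struct

# Identifiers seen in this thread, plus JFXX, which the JFIF spec uses
# for its APP0 extension segment.
KNOWN_IDENTIFIERS = {
    b"JFIF\x00": "JFIF",
    b"JFXX\x00": "JFIF extension",
    b"AVI1": "AVI1",
}


def describe_app0(segment: bytes) -> str:
    """Classify an APP0 segment by the identifier at the start of its payload."""
    marker, length = struct.unpack(">HH", segment[:4])
    if marker != 0xFFE0:
        raise ValueError("not an APP0 segment")
    payload = segment[4:2 + length]  # length counts its own two bytes
    for identifier, name in KNOWN_IDENTIFIERS.items():
        if payload.startswith(identifier):
            return name
    if len(payload) < 4:
        return (f"unidentified ({len(payload)} payload bytes, "
                "too short for an identifier)")
    return "unidentified"


if __name__ == "__main__":
    # The six bytes shown by xxd -s 181 -l 6 above.
    print(describe_app0(bytes.fromhex("ffe000040000")))
```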
I found that if the second APP0 is removed with a hex editor (this could probably be automated), the error goes away.
[AVFormatContext @ 0x603be737e480] Opening 'Logitech-C270-003.jpg' for reading
[file @ 0x603be737eb40] Setting default whitelist 'file,crypto,data'
[image2 @ 0x603be737e480] Format image2 probed with size=2048 and score=50
[image2 @ 0x603be737e480] Before avformat_find_stream_info() pos: 0 bytes read:32768 seeks:0 nb_streams:1
[mjpeg @ 0x603be737f680] marker=d8 avail_size_in_buf=47597
[mjpeg @ 0x603be737f680] marker parser used 0 bytes (0 bits)
[mjpeg @ 0x603be737f680] marker=e0 avail_size_in_buf=47595
[mjpeg @ 0x603be737f680] polarity 0
[mjpeg @ 0x603be737f680] marker parser used 32 bytes (256 bits)
[mjpeg @ 0x603be737f680] marker=db avail_size_in_buf=47560
[mjpeg @ 0x603be737f680] index=0
[mjpeg @ 0x603be737f680] qscale[0]: 6
[mjpeg @ 0x603be737f680] marker parser used 67 bytes (536 bits)
[mjpeg @ 0x603be737f680] marker=db avail_size_in_buf=47491
[mjpeg @ 0x603be737f680] index=1
[mjpeg @ 0x603be737f680] qscale[1]: 13
[mjpeg @ 0x603be737f680] marker parser used 67 bytes (536 bits)
[mjpeg @ 0x603be737f680] marker=dd avail_size_in_buf=47422
[mjpeg @ 0x603be737f680] marker parser used 0 bytes (0 bits)
[mjpeg @ 0x603be737f680] marker=c0 avail_size_in_buf=47416
[mjpeg @ 0x603be737f680] Changing bps from 0 to 8
[mjpeg @ 0x603be737f680] sof0: picture: 1280x720
[mjpeg @ 0x603be737f680] component 0 2:1 id: 1 quant:0
[mjpeg @ 0x603be737f680] component 1 1:1 id: 2 quant:1
[mjpeg @ 0x603be737f680] component 2 1:1 id: 3 quant:1
[mjpeg @ 0x603be737f680] pix fmt id 21111100
[mjpeg @ 0x603be737f680] Format yuvj422p chosen by get_format().
[mjpeg @ 0x603be737f680] marker parser used 17 bytes (136 bits)
[mjpeg @ 0x603be737f680] escaping removed 772 bytes
[mjpeg @ 0x603be737f680] marker=da avail_size_in_buf=47397
[mjpeg @ 0x603be737f680] marker parser used 46625 bytes (373000 bits)
[mjpeg @ 0x603be737f680] marker=d3 avail_size_in_buf=754
[mjpeg @ 0x603be737f680] restart marker: 3
[mjpeg @ 0x603be737f680] marker parser used 0 bytes (0 bits)
[mjpeg @ 0x603be737f680] marker=d4 avail_size_in_buf=714
[mjpeg @ 0x603be737f680] restart marker: 4
[mjpeg @ 0x603be737f680] marker parser used 0 bytes (0 bits)
[mjpeg @ 0x603be737f680] marker=d5 avail_size_in_buf=673
[mjpeg @ 0x603be737f680] restart marker: 5
[mjpeg @ 0x603be737f680] marker parser used 0 bytes (0 bits)
[mjpeg @ 0x603be737f680] marker=d6 avail_size_in_buf=630
[mjpeg @ 0x603be737f680] restart marker: 6
[mjpeg @ 0x603be737f680] marker parser used 0 bytes (0 bits)
[mjpeg @ 0x603be737f680] marker=d9 avail_size_in_buf=577
[mjpeg @ 0x603be737f680] decode frame unused 577 bytes
[image2 @ 0x603be737e480] After avformat_find_stream_info() pos: 47599 bytes read:47599 seeks:0 frames:1
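The hex-editor step could be automated along these lines. This is a minimal sketch, not a tested tool: strip_app0 is a hypothetical helper of mine, it only walks the segments before SOS, and it assumes each of them has a length field. With keep_first=True it drops every APP0 except the first one; with keep_first=False it drops them all.

```python
import struct


def strip_app0(data: bytes, keep_first: bool = True) -> bytes:
    """Return a copy of a JPEG with extra APP0 segments removed.

    Only the header segments before SOS are examined; everything from
    SOS onwards is copied through untouched.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    out = bytearray(data[:2])
    pos = 2
    seen_app0 = False
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker == 0xDA:  # start of scan: stop parsing
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        is_app0 = marker == 0xE0
        if not (is_app0 and (seen_app0 or not keep_first)):
            out += data[pos:end]  # keep this segment
        seen_app0 = seen_app0 or is_app0
        pos = end
    out += data[pos:]  # scan data and anything after it
    return bytes(out)
```

For the frame above this removes the 6-byte segment at offset 181, which matches the drop in avail_size_in_buf from 47603 to 47597 between the two debug logs.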
Removing all APP0 segments works as well. For reference, FFmpeg's MJPEG encoder does not generate any APP0 segment.
$ ffmpeg -i Logitech-C270-003.jpg Logitech-C270-003-duplicate.jpg
...
Input #0, image2, from 'Logitech-C270-003.jpg':
Duration: 00:00:00.04, start: 0.000000, bitrate: 9512 kb/s
Stream #0:0: Video: mjpeg (Baseline), yuvj422p(pc, bt470bg/unknown/unknown), 1280x720, 25 fps, 25 tbr, 25 tbn
File 'Logitech-C270-003-duplicate.jpg' already exists. Overwrite? [y/N] y
Stream mapping:
Stream #0:0 -> #0:0 (mjpeg (native) -> mjpeg (native))
Press [q] to stop, [?] for help
Output #0, image2, to 'Logitech-C270-003-duplicate.jpg':
Metadata:
encoder : Lavf60.16.100
Stream #0:0: Video: mjpeg, yuvj422p(pc, bt470bg/unknown/unknown, progressive), 1280x720, q=2-31, 200 kb/s, 25 fps, 25 tbn
Metadata:
encoder : Lavc60.31.102 mjpeg
Side data:
cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: N/A
[image2 @ 0x5d6aa3c189c0] The specified filename 'Logitech-C270-003-duplicate.jpg' does not contain an image sequence pattern or a pattern is invalid.
[image2 @ 0x5d6aa3c189c0] Use a pattern such as %03d for an image sequence or use the -update option (with -frames:v 1 if needed) to write a single image.
[out#0/image2 @ 0x5d6aa3c134c0] video:40kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
frame= 1 fps=0.0 q=5.8 Lsize=N/A time=00:00:00.00 bitrate=N/A speed= 0x
$ exiv2 -pS Logitech-C270-003-duplicate.jpg
STRUCTURE OF JPEG FILE: Logitech-C270-003-duplicate.jpg
address | marker | length | data
0 | 0xffd8 SOI
2 | 0xfffe COM | 16 | Lavc60.31.102
20 | 0xffdb DQT | 67
89 | 0xffc4 DHT | 159
250 | 0xffc0 SOF0 | 17
269 | 0xffda SOS