How to use sed to identify text blocks in mediainfo output to generate action commands


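Looking at a previously posted question and its answer, "Find and return a block of lines containing a string", user @potong provided an elegant solution using a fairly simple command.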

I have been experimenting with the output of the mediainfo command, run against a media data file to produce a text listing of the file's streams.

General
Unique ID                                : 
 (0xDAC55CA81AA8F777EB9DE67AC6)
Complete name                            : Some Media File.mkv
Format                                   : Matroska
Format version                           : Version 4
File size                                : 1.44 GiB
Duration                                 : 46 min 22 s
Overall bit rate                         : 4 761 kb/s
Frame rate                               : 23.976 FPS
Encoded date                             : 2023-08-08 06:39:11 UTC
Writing application                      : mkvmerge v76.0 ('Celebration') 64-bit
Writing library                          : libebml v1.4.4 + libmatroska v1.7.1

Video
ID                                       : 1
Format                                   : HEVC
Format/Info                              : High Efficiency Video Coding
Format profile                           : Main 10@L4@Main
Codec ID                                 : V_MPEGH/ISO/HEVC
Duration                                 : 46 min 22 s
Bit rate                                 : 4 501 kb/s
Width                                    : 1 920 pixels
Height                                   : 1 080 pixels
Display aspect ratio                     : 16:9
Frame rate mode                          : Constant
Frame rate                               : 23.976 (24000/1001) FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 10 bits
Bits/(Pixel*Frame)                       : 0.091
Stream size                              : 1.36 GiB (95%)
Writing library                          : x265 3.5+96-d844ab494:[Windows][GCC 12.2.0][64 bit] 10bit
Encoding settings                        : cpuid=1111039 / frame-threads=3 / numa-pools=8 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=1 / input-res=1920x1080 / interlace=0 / total-frames=0 / level-idc=0 / high-tier=1 / uhd-bd=0 / ref=4 / no-allow-non-conformance / no-repeat-headers / annexb / no-aud / no-eob / no-eos / no-hrd / info / hash=0 / temporal-layers=0 / open-gop / min-keyint=23 / keyint=250 / gop-lookahead=0 / bframes=8 / b-adapt=2 / b-pyramid / bframe-bias=0 / rc-lookahead=25 / lookahead-slices=4 / scenecut=40 / no-hist-scenecut / radl=0 / no-splice / no-intra-refresh / ctu=64 / min-cu-size=8 / rect / no-amp / max-tu-size=32 / tu-inter-depth=1 / tu-intra-depth=1 / limit-tu=0 / rdoq-level=2 / dynamic-rd=0.00 / no-ssim-rd / signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-constrained-intra / strong-intra-smoothing / max-merge=3 / limit-refs=3 / limit-modes / me=3 / subme=3 / merange=57 / temporal-mvp / no-frame-dup / no-hme / weightp / no-weightb / no-analyze-src-pics / deblock=0:0 / no-sao / no-sao-non-deblock / rd=4 / selective-sao=0 / no-early-skip / rskip / no-fast-intra / no-tskip-fast / no-cu-lossless / no-b-intra / no-splitrd-skip / rdpenalty=0 / psy-rd=2.00 / psy-rdoq=1.00 / no-rd-refine / no-lossless / cbqpoffs=0 / crqpoffs=0 / rc=abr / bitrate=4500 / qcomp=0.60 / qpstep=4 / stats-write=0 / stats-read=2 / cplxblur=20.0 / qblur=0.5 / ipratio=1.40 / pbratio=1.30 / aq-mode=3 / aq-strength=1.00 / cutree / zone-count=0 / no-strict-cbr / qg-size=32 / no-rc-grain / qpmax=69 / qpmin=0 / no-const-vbv / sar=1 / overscan=0 / videoformat=5 / range=0 / colorprim=2 / transfer=2 / colormatrix=2 / chromaloc=0 / display-window=0 / cll=0,0 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-timing-info / vui-hrd-info / slices=1 / no-opt-qp-pps / no-opt-ref-list-length-pps / multi-pass-opt-rps / scenecut-bias=0.05 / no-opt-cu-delta-qp / aq-motion / no-hdr10 / no-hdr10-opt / no-dhdr10-opt / no-idr-recovery-sei / 
analysis-reuse-level=0 / analysis-save-reuse-level=0 / analysis-load-reuse-level=0 / scale-factor=0 / refine-intra=0 / refine-inter=0 / refine-mv=1 / refine-ctu-distortion=0 / no-limit-sao / ctu-info=0 / no-lowpass-dct / refine-analysis-type=0 / copy-pic=1 / max-ausize-factor=1.0 / no-dynamic-refine / no-single-sei / no-hevc-aq / no-svt / no-field / qp-adaptation-range=1.00 / scenecut-aware-qp=0conformance-window-offsets / right=0 / bottom=0 / decoder-max-rate=0 / no-vbv-live-multi-pass / no-mcstf / no-sbrc
Default                                  : Yes
Forced                                   : Yes
Color range                              : Limited

Audio
ID                                       : 2
Format                                   : E-AC-3
Format/Info                              : Enhanced AC-3
Commercial name                          : Dolby Digital Plus
Codec ID                                 : A_EAC3
Duration                                 : 46 min 22 s
Bit rate mode                            : Constant
Bit rate                                 : 256 kb/s
Channel(s)                               : 6 channels
Channel layout                           : L R C LFE Ls Rs
Sampling rate                            : 48.0 kHz
Frame rate                               : 31.250 FPS (1536 SPF)
Compression mode                         : Lossy
Stream size                              : 79.3 MiB (5%)
Title                                    : English
Language                                 : English
Service kind                             : Complete Main
Default                                  : Yes
Forced                                   : No
Dialog Normalization                     : -27 dB
compr                                    : -0.28 dB
mixlevel                                 : 105 dB
roomtyp                                  : Small
ltrtcmixlev                              : 3.0 dB
ltrtsurmixlev                            : -3.0 dB
lorocmixlev                              : 3.0 dB
lorosurmixlev                            : -3.0 dB
dialnorm_Average                         : -27 dB
dialnorm_Minimum                         : -27 dB
dialnorm_Maximum                         : -27 dB

Text #1
ID                                       : 3
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 77 b/s
Frame rate                               : 0.255 FPS
Count of elements                        : 649
Stream size                              : 24.1 KiB (0%)
Language                                 : French
Default                                  : No
Forced                                   : No

Text #2
ID                                       : 4
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 76 b/s
Frame rate                               : 0.259 FPS
Count of elements                        : 659
Stream size                              : 23.8 KiB (0%)
Language                                 : German
Default                                  : No
Forced                                   : No

Text #3
ID                                       : 5
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 26 min 14 s
Bit rate                                 : 0 b/s
Frame rate                               : 0.003 FPS
Count of elements                        : 5
Stream size                              : 91.0 Bytes (0%)
Language                                 : Italian
Default                                  : No
Forced                                   : No

Text #4
ID                                       : 6
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 75 b/s
Frame rate                               : 0.260 FPS
Count of elements                        : 661
Stream size                              : 23.3 KiB (0%)
Language                                 : Italian
Default                                  : No
Forced                                   : No

Text #5
ID                                       : 7
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 21 s
Bit rate                                 : 56 b/s
Frame rate                               : 0.253 FPS
Count of elements                        : 643
Stream size                              : 17.5 KiB (0%)
Language                                 : Japanese
Default                                  : No
Forced                                   : No

Text #6
ID                                       : 8
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 84 b/s
Frame rate                               : 0.260 FPS
Count of elements                        : 660
Stream size                              : 26.2 KiB (0%)
Language                                 : Korean
Default                                  : No
Forced                                   : No

Text #7
ID                                       : 9
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 71 b/s
Frame rate                               : 0.258 FPS
Count of elements                        : 656
Stream size                              : 22.0 KiB (0%)
Language                                 : Norwegian
Default                                  : No
Forced                                   : No

Text #8
ID                                       : 10
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 30 min 43 s
Bit rate                                 : 0 b/s
Frame rate                               : 0.003 FPS
Count of elements                        : 6
Stream size                              : 75.0 Bytes (0%)
Language                                 : Polish
Default                                  : No
Forced                                   : No

Text #9
ID                                       : 11
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 70 b/s
Frame rate                               : 0.261 FPS
Count of elements                        : 663
Stream size                              : 21.9 KiB (0%)
Language                                 : Polish
Default                                  : No
Forced                                   : No

Text #10
ID                                       : 12
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 74 b/s
Frame rate                               : 0.261 FPS
Count of elements                        : 663
Stream size                              : 23.1 KiB (0%)
Language                                 : Portuguese
Default                                  : No
Forced                                   : No

Text #11
ID                                       : 13
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 79 b/s
Frame rate                               : 0.260 FPS
Count of elements                        : 660
Stream size                              : 24.7 KiB (0%)
Language                                 : Portuguese
Default                                  : No
Forced                                   : No

Text #12
ID                                       : 14
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 77 b/s
Frame rate                               : 0.260 FPS
Count of elements                        : 662
Stream size                              : 24.2 KiB (0%)
Language                                 : Spanish
Default                                  : No
Forced                                   : No

Text #13
ID                                       : 15
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 30 min 41 s
Bit rate                                 : 0 b/s
Frame rate                               : 0.003 FPS
Count of elements                        : 6
Stream size                              : 77.0 Bytes (0%)
Language                                 : Spanish
Default                                  : No
Forced                                   : No

Text #14
ID                                       : 16
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 69 b/s
Frame rate                               : 0.254 FPS
Count of elements                        : 645
Stream size                              : 21.6 KiB (0%)
Language                                 : Spanish
Default                                  : No
Forced                                   : No

Text #15
ID                                       : 17
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 74 b/s
Frame rate                               : 0.258 FPS
Count of elements                        : 657
Stream size                              : 23.1 KiB (0%)
Language                                 : Swedish
Default                                  : No
Forced                                   : No

Text #16
ID                                       : 18
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 21 s
Bit rate                                 : 84 b/s
Frame rate                               : 0.311 FPS
Count of elements                        : 791
Stream size                              : 26.3 KiB (0%)
Language                                 : English
Default                                  : No
Forced                                   : No

Text #17
ID                                       : 19
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 75 b/s
Frame rate                               : 0.258 FPS
Count of elements                        : 655
Stream size                              : 23.4 KiB (0%)
Language                                 : English
Default                                  : No
Forced                                   : No

Text #18
ID                                       : 20
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 43 min 9 s
Bit rate                                 : 69 b/s
Frame rate                               : 0.285 FPS
Count of elements                        : 739
Stream size                              : 21.9 KiB (0%)
Language                                 : Chinese
Default                                  : No
Forced                                   : No

Text #19
ID                                       : 21
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 62 b/s
Frame rate                               : 0.261 FPS
Count of elements                        : 663
Stream size                              : 19.4 KiB (0%)
Language                                 : Chinese
Default                                  : No
Forced                                   : No

Text #20
ID                                       : 22
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 73 b/s
Frame rate                               : 0.259 FPS
Count of elements                        : 659
Stream size                              : 22.9 KiB (0%)
Language                                 : Danish
Default                                  : No
Forced                                   : No

Text #21
ID                                       : 23
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 64 b/s
Frame rate                               : 0.249 FPS
Count of elements                        : 634
Stream size                              : 19.9 KiB (0%)
Language                                 : Dutch
Default                                  : No
Forced                                   : No

Text #22
ID                                       : 24
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 74 b/s
Frame rate                               : 0.259 FPS
Count of elements                        : 658
Stream size                              : 23.2 KiB (0%)
Language                                 : Finnish
Default                                  : No
Forced                                   : No

Text #23
ID                                       : 25
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 26 min 14 s
Bit rate                                 : 0 b/s
Frame rate                               : 0.003 FPS
Count of elements                        : 5
Stream size                              : 70.0 Bytes (0%)
Language                                 : French
Default                                  : No
Forced                                   : No

So, using this command:

sed -n '/^Text #/!{H;$!d};x;/English/p' 'Some Media File.mkv.MDINFO'

I get this output:

General
Unique ID                                : 
 (0xDAC55CA81AA8F777EB9DE67AC6)
Complete name                            : Some Media File.mkv
Format                                   : Matroska
Format version                           : Version 4
File size                                : 1.44 GiB
Duration                                 : 46 min 22 s
Overall bit rate                         : 4 761 kb/s
Frame rate                               : 23.976 FPS
Encoded date                             : 2023-08-08 06:39:11 UTC
Writing application                      : mkvmerge v76.0 ('Celebration') 64-bit
Writing library                          : libebml v1.4.4 + libmatroska v1.7.1

Video
ID                                       : 1
Format                                   : HEVC
Format/Info                              : High Efficiency Video Coding
Format profile                           : Main 10@L4@Main
Codec ID                                 : V_MPEGH/ISO/HEVC
Duration                                 : 46 min 22 s
Bit rate                                 : 4 501 kb/s
Width                                    : 1 920 pixels
Height                                   : 1 080 pixels
Display aspect ratio                     : 16:9
Frame rate mode                          : Constant
Frame rate                               : 23.976 (24000/1001) FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 10 bits
Bits/(Pixel*Frame)                       : 0.091
Stream size                              : 1.36 GiB (95%)
Writing library                          : x265 3.5+96-d844ab494:[Windows][GCC 12.2.0][64 bit] 10bit
Encoding settings                        : cpuid=1111039 / frame-threads=3 / numa-pools=8 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=1 / input-res=1920x1080 / interlace=0 / total-frames=0 / level-idc=0 / high-tier=1 / uhd-bd=0 / ref=4 / no-allow-non-conformance / no-repeat-headers / annexb / no-aud / no-eob / no-eos / no-hrd / info / hash=0 / temporal-layers=0 / open-gop / min-keyint=23 / keyint=250 / gop-lookahead=0 / bframes=8 / b-adapt=2 / b-pyramid / bframe-bias=0 / rc-lookahead=25 / lookahead-slices=4 / scenecut=40 / no-hist-scenecut / radl=0 / no-splice / no-intra-refresh / ctu=64 / min-cu-size=8 / rect / no-amp / max-tu-size=32 / tu-inter-depth=1 / tu-intra-depth=1 / limit-tu=0 / rdoq-level=2 / dynamic-rd=0.00 / no-ssim-rd / signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-constrained-intra / strong-intra-smoothing / max-merge=3 / limit-refs=3 / limit-modes / me=3 / subme=3 / merange=57 / temporal-mvp / no-frame-dup / no-hme / weightp / no-weightb / no-analyze-src-pics / deblock=0:0 / no-sao / no-sao-non-deblock / rd=4 / selective-sao=0 / no-early-skip / rskip / no-fast-intra / no-tskip-fast / no-cu-lossless / no-b-intra / no-splitrd-skip / rdpenalty=0 / psy-rd=2.00 / psy-rdoq=1.00 / no-rd-refine / no-lossless / cbqpoffs=0 / crqpoffs=0 / rc=abr / bitrate=4500 / qcomp=0.60 / qpstep=4 / stats-write=0 / stats-read=2 / cplxblur=20.0 / qblur=0.5 / ipratio=1.40 / pbratio=1.30 / aq-mode=3 / aq-strength=1.00 / cutree / zone-count=0 / no-strict-cbr / qg-size=32 / no-rc-grain / qpmax=69 / qpmin=0 / no-const-vbv / sar=1 / overscan=0 / videoformat=5 / range=0 / colorprim=2 / transfer=2 / colormatrix=2 / chromaloc=0 / display-window=0 / cll=0,0 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-timing-info / vui-hrd-info / slices=1 / no-opt-qp-pps / no-opt-ref-list-length-pps / multi-pass-opt-rps / scenecut-bias=0.05 / no-opt-cu-delta-qp / aq-motion / no-hdr10 / no-hdr10-opt / no-dhdr10-opt / no-idr-recovery-sei / 
analysis-reuse-level=0 / analysis-save-reuse-level=0 / analysis-load-reuse-level=0 / scale-factor=0 / refine-intra=0 / refine-inter=0 / refine-mv=1 / refine-ctu-distortion=0 / no-limit-sao / ctu-info=0 / no-lowpass-dct / refine-analysis-type=0 / copy-pic=1 / max-ausize-factor=1.0 / no-dynamic-refine / no-single-sei / no-hevc-aq / no-svt / no-field / qp-adaptation-range=1.00 / scenecut-aware-qp=0conformance-window-offsets / right=0 / bottom=0 / decoder-max-rate=0 / no-vbv-live-multi-pass / no-mcstf / no-sbrc
Default                                  : Yes
Forced                                   : Yes
Color range                              : Limited

Audio
ID                                       : 2
Format                                   : E-AC-3
Format/Info                              : Enhanced AC-3
Commercial name                          : Dolby Digital Plus
Codec ID                                 : A_EAC3
Duration                                 : 46 min 22 s
Bit rate mode                            : Constant
Bit rate                                 : 256 kb/s
Channel(s)                               : 6 channels
Channel layout                           : L R C LFE Ls Rs
Sampling rate                            : 48.0 kHz
Frame rate                               : 31.250 FPS (1536 SPF)
Compression mode                         : Lossy
Stream size                              : 79.3 MiB (5%)
Title                                    : English
Language                                 : English
Service kind                             : Complete Main
Default                                  : Yes
Forced                                   : No
Dialog Normalization                     : -27 dB
compr                                    : -0.28 dB
mixlevel                                 : 105 dB
roomtyp                                  : Small
ltrtcmixlev                              : 3.0 dB
ltrtsurmixlev                            : -3.0 dB
lorocmixlev                              : 3.0 dB
lorosurmixlev                            : -3.0 dB
dialnorm_Average                         : -27 dB
dialnorm_Minimum                         : -27 dB
dialnorm_Maximum                         : -27 dB

Text #16
ID                                       : 18
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 21 s
Bit rate                                 : 84 b/s
Frame rate                               : 0.311 FPS
Count of elements                        : 791
Stream size                              : 26.3 KiB (0%)
Language                                 : English
Default                                  : No
Forced                                   : No

Text #17
ID                                       : 19
Format                                   : UTF-8
Codec ID                                 : S_TEXT/UTF8
Codec ID/Info                            : UTF-8 Plain Text
Duration                                 : 46 min 22 s
Bit rate                                 : 75 b/s
Frame rate                               : 0.258 FPS
Count of elements                        : 655
Stream size                              : 23.4 KiB (0%)
Language                                 : English
Default                                  : No
Forced                                   : No

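Not being a sed expert, I am trying to understand why the General, Video, and Audio blocks are being printed. I thought the /^Text #/ in the sed command would exclude any block of data that does not begin with "Text #"?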

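The output of only the Text blocks marked as English is perfect, because what I want to do is process about fifty such files (possibly more later) in which the English subtitle text is the only stream I am interested in, but the stream that holds the English text is not consistent from file to file.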

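The purpose of my script, then, is to identify the stream numbers of the English text streams and to use that information, via MKVToolNix, in a follow-up script that strips all non-English text streams and keeps only the English ones. I may also dump the English streams to .srt files to keep separately as a backup.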

1 Answer

This might work for you (GNU sed):

sed -n '/^Text #/!{H;$!d};x;/^General$/Md;/English/p' file

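The instruction /^General$/Md discards the preamble, i.e. any text before the first "Text #" line. The M flag (a GNU extension) lets ^ and $ match at line boundaries inside the multi-line block, so a block containing a line that reads exactly "General" is deleted before the /English/ test runs.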
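As for why the preamble was printed in the first place: every line before the first "Text #" header is accumulated into one block in the hold space, and that block includes the Audio section's "Language : English" line, so /English/p matches it. A minimal sketch on a cut-down sample follows; the file name sample.txt and its abbreviated contents are invented for illustration, and GNU sed is assumed for the M flag.

```shell
# Build a cut-down sample; the contents are abbreviated, hypothetical mediainfo output.
cat > sample.txt <<'EOF'
General
Format                                   : Matroska

Audio
ID                                       : 2
Language                                 : English

Text #1
ID                                       : 3
Language                                 : French

Text #2
ID                                       : 4
Language                                 : English
EOF

# Original command: the preamble block matches /English/ through the Audio section.
before=$(sed -n '/^Text #/!{H;$!d};x;/English/p' sample.txt)

# Fixed command: any block containing a line that is exactly "General" is deleted first.
after=$(sed -n '/^Text #/!{H;$!d};x;/^General$/Md;/English/p' sample.txt)

# Prints only the "Text #2" block.
printf '%s\n' "$after"
```

Comparing `$before` with `$after` shows the General and Audio sections present in the first and absent from the second.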

Or you may prefer:

sed -n '/^Text #/!{H;$!d};x;/^\nGeneral\n/d;/English/p' file
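Either command leaves only the English "Text #" blocks, so the stream numbers the question asks about could then be read from the "ID" lines. What follows is a rough sketch rather than a tested workflow. The sample file is invented. The mapping "mkvmerge track ID = mediainfo ID minus one" is an assumption that should be checked against `mkvmerge --identify` for the actual files. The output file name is a placeholder, and the script only prints a candidate command instead of running it.

```shell
# Hypothetical, abbreviated mediainfo output for illustration.
cat > sample.txt <<'EOF'
General
Format                                   : Matroska

Audio
ID                                       : 2
Language                                 : English

Text #1
ID                                       : 3
Language                                 : French

Text #2
ID                                       : 4
Language                                 : English

Text #3
ID                                       : 5
Language                                 : English
EOF

# Keep only the English Text blocks, then take the number from each "ID : N" line.
ids=$(sed -n '/^Text #/!{H;$!d};x;/^General$/Md;/English/p' sample.txt |
      sed -n 's/^ID *: *\([0-9][0-9]*\)$/\1/p')

# ASSUMPTION: mkvmerge track ID = mediainfo ID - 1; verify with `mkvmerge --identify`.
tracks=$(for id in $ids; do echo $((id - 1)); done | paste -sd, -)

# Print, rather than run, a candidate command (file names are placeholders).
echo "mkvmerge -o 'out.mkv' --subtitle-tracks $tracks 'Some Media File.mkv'"
```

Printing the command first makes it easy to review the selected tracks for each of the fifty or so files before anything is rewritten.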