How to parse data from an XML file in a shell script


I have this XML file, and I am trying to use xmllint or grep to get output like this:

availStart="2024-05-24T05:00:30" availId="811220455" q1:campaignIdRef=180398
availStart="2024-05-24T05:13:41" availId="811220484" q2:campaignIdRef=179163
availStart="2024-05-24T05:13:41" availId="811220484" q2:campaignIdRef=179166
availStart="2024-05-24T05:13:41" availId="811220484" q2:campaignIdRef=179164
availStart="2024-05-24T05:13:41" availId="811220484" q2:campaignIdRef=179165

The XML file is shown below. I tried a while read LINE loop with a few grep commands, but it did not capture everything. I am also struggling with xmllint's xpath or path options. Can anyone help?

<?xml version="1.0" encoding="utf-8"?>
<Schedule xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="urn:scte:118:version01" level="0" revision="1" zoneName="test" networkName="XXX" schemaVersion="urn:scte:118:version01" begDateTime="2024-05-24T06:00:00" broadcastDate="20240524" endDateTime="2024-05-25T05:59:59">
  <Window windowStart="2024-05-24T04:00:30" windowDuration="02000000">
    <Avail xmlns:q1="urn:nds:dynamic:ruleManagerICD:01:01" xsi:type="q1:NDSAvailType" availStart="2024-05-24T05:00:30" availInWindow="1" availId="811220455" eventKey="811220455" durationFrames="0" durationSeconds="15" framesPerSecond="25Frames" breakId="1">
      <Spot xsi:type="q1:SpotRulesType" spotId="1" length="00001500" positionInAvail="1" trafficId="312999" eventType="LOI" lengthFrames="0">
        <q1:substitutionOptionSlot>
          <q1:adInSpot>
            <q1:campaignRef>
              <q1:campaignIdRef>180398</q1:campaignIdRef>
            </q1:campaignRef>
          </q1:adInSpot>
        </q1:substitutionOptionSlot>
      </Spot>
      <q1:inactivityTime primaryDevice="7200" secondaryDevice="5400"/>
    </Avail>
  </Window>
  <Window windowStart="2024-05-24T04:13:41" windowDuration="02000000">
    <Avail xmlns:q2="urn:nds:dynamic:ruleManagerICD:01:01" xsi:type="q2:NDSAvailType" availStart="2024-05-24T05:13:41" availInWindow="1" availId="811220484" eventKey="811220484" durationFrames="0" durationSeconds="10" framesPerSecond="25Frames" breakId="2">
      <Spot xsi:type="q2:SpotRulesType" spotId="1" length="00001000" positionInAvail="1" trafficId="312999" eventType="LOI" lengthFrames="0">
        <q2:substitutionOptionSlot>
          <q2:adInSpot>
            <q2:campaignRef>
              <q2:campaignIdRef>179163</q2:campaignIdRef>
            </q2:campaignRef>
            <q2:campaignRef>
              <q2:campaignIdRef>179166</q2:campaignIdRef>
            </q2:campaignRef>
            <q2:campaignRef>
              <q2:campaignIdRef>179164</q2:campaignIdRef>
            </q2:campaignRef>
            <q2:campaignRef>
              <q2:campaignIdRef>179165</q2:campaignIdRef>
            </q2:campaignRef>
          </q2:adInSpot>
        </q2:substitutionOptionSlot>
      </Spot>
      <q2:inactivityTime primaryDevice="7200" secondaryDevice="5400"/>
    </Avail>
  </Window>
</Schedule>

I tried the following, but I get no value in the $avail variable. Only the first variable, $time, is populated:

while read LINE; do time=$(grep -io -P 'availstart=.{20}') && avail=$(grep -io -P 'availId=.{11}');echo -e "\n$time and $avail";done < sample.xml
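
A likely reason only $time is populated: neither grep is given a file, so the first grep reads the loop's standard input and consumes the rest of sample.xml, leaving nothing for the second grep. A minimal sketch of a line-based fix that matches against the line already read instead; it assumes one tag per line as in the sample, the function name print_avails is hypothetical, and it is not a real XML parser:

```shell
#!/bin/bash
# Hypothetical helper: prints one line per campaignIdRef, prefixed with the
# availStart/availId of the most recently seen <Avail> line.
# Assumes one tag per line, as in the sample; not a general XML parser.
print_avails() {
    local line start id
    # Regexes are kept in variables so bash treats them as patterns.
    local avail_re='<Avail .*availStart="([^"]*)".*availId="([^"]*)"'
    local camp_re='<([^<>/ :]+:campaignIdRef)>([^<]*)</'
    while IFS= read -r line; do
        if [[ $line =~ $avail_re ]]; then
            # Remember the attributes of the current Avail
            start=${BASH_REMATCH[1]}
            id=${BASH_REMATCH[2]}
        elif [[ $line =~ $camp_re ]]; then
            # Group 1 is the prefixed tag name, group 2 the campaign ID
            printf 'availStart="%s" availId="%s" %s=%s\n' \
                "$start" "$id" "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
        fi
    done < "$1"
}
```

It would be called as print_avails sample.xml. Because it depends on the line layout, it breaks if the file is reformatted, for example if an Avail element spans several lines.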

xml shell xml-namespaces xmllint
1 Answer

xmllint + xpath is a good choice for this job:

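#!/bin/bash
# Collects the attributes of the current Avail, then prints one line per campaign ID.
# Relies on xmllint printing each selected node on its own line, with
# attributes as ' name="value"' and text nodes as bare values.
line=''
prevline=''
while IFS='=' read -r attr value;do
    #echo "$attr | $value"
    if [ -n "$value" ];then
        # Attribute line: rebuild name="value" (tr stripped the quotes) and append it
        line+="$attr="'"'"$value"'"'
        # The leading space comes from how xmllint prints attribute nodes
        if [ "$attr" = ' availId' ];then
            prevline="$line"
        fi
    else
        # Text node (campaign ID): the whole line ends up in $attr.
        # Note that the q2: prefix is hard-coded here.
        echo -e "$prevline q2:campaignIdRef=$attr"
        line=''
    fi
done < <(xmllint --xpath '//@availStart | //@availId | //*[local-name()="Avail"]//*[local-name()="campaignIdRef"]/text()' tmp2.xml | tr -d '"')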

Result:

 availStart="2024-05-24T05:00:30" availId="811220455" q2:campaignIdRef=180398
 availStart="2024-05-24T05:13:41" availId="811220484" q2:campaignIdRef=179163
 availStart="2024-05-24T05:13:41" availId="811220484" q2:campaignIdRef=179166
 availStart="2024-05-24T05:13:41" availId="811220484" q2:campaignIdRef=179164
 availStart="2024-05-24T05:13:41" availId="811220484" q2:campaignIdRef=179165
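
Note that the first line above shows q2:campaignIdRef where the question asked for q1:campaignIdRef, because the prefix is hard-coded. A sketch of an alternative that asks xmllint for one value at a time with the XPath functions count(), string() and name(), so it reports the prefix actually used in the file and does not depend on how xmllint prints node sets. The function name list_campaigns is hypothetical, and it assumes an xmllint build with --xpath support:

```shell
#!/bin/bash
# Hypothetical helper: one xmllint call per value, so nothing depends on
# how xmllint formats attribute or text nodes in a node set.
list_campaigns() {
    local file=$1 n m i j a c start id
    local avail='//*[local-name()="Avail"]'
    local camp='*[local-name()="campaignIdRef"]'
    # Number of Avail elements in the document
    n=$(xmllint --xpath "count($avail)" "$file")
    for ((i = 1; i <= n; i++)); do
        a="($avail)[$i]"    # the i-th Avail in document order
        start=$(xmllint --xpath "string($a/@availStart)" "$file")
        id=$(xmllint --xpath "string($a/@availId)" "$file")
        # Number of campaignIdRef elements below this Avail
        m=$(xmllint --xpath "count($a//$camp)" "$file")
        for ((j = 1; j <= m; j++)); do
            c="($a//$camp)[$j]"
            # name() returns the prefixed name as written in the file
            printf 'availStart="%s" availId="%s" %s=%s\n' "$start" "$id" \
                "$(xmllint --xpath "name($c)" "$file")" \
                "$(xmllint --xpath "string($c)" "$file")"
        done
    done
}
```

It would be called as list_campaigns tmp2.xml. The trade-off is speed: xmllint is started several times per Avail, which is fine for a file like the sample but slow for large schedules.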