How to sort an XML file by xsi:type?


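I know there are already several questions about sorting XML, but none of them seems to fit my case. I have the following XML file, which is an excerpt of the data schema of an Esri file geodatabase: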

import xml.etree.ElementTree as ET
from operator import attrgetter

data = """<esri:Workspace xmlns:esri='http://www.esri.com/schemas/ArcGIS/10.8' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:xs='http://www.w3.org/2001/XMLSchema'>
    <WorkspaceDefinition xsi:type='esri:WorkspaceDefinition'>
        <WorkspaceType>esriLocalDatabaseWorkspace</WorkspaceType>
        <Version/>
        <Domains xsi:type='esri:ArrayOfDomain'/>
        <Sequences xsi:type='esri:ArrayOfSequence'/>
        <DatasetDefinitions xsi:type='esri:ArrayOfDataElement'>
            <DataElement xsi:type='esri:DEFeatureClass'/>
            <DataElement xsi:type='esri:DEFeatureClass'/>
            <DataElement xsi:type='esri:DEFeatureClass'/>
            <DataElement xsi:type='esri:DEFeatureDataset'/>
            <DataElement xsi:type='esri:DEFeatureClass'/>
            <DataElement xsi:type='esri:DEFeatureClass'/>
        </DatasetDefinitions>
    </WorkspaceDefinition>
    <WorkspaceData xsi:type='esri:WorkspaceData'/>
</esri:Workspace>"""    
    
root_1 = ET.fromstring(data)

I want to sort it by tag and by DataElement type, so that it ends up ordered like this:

WorkspaceData {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:WorkspaceData'}
WorkspaceDefinition {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:WorkspaceDefinition'}
     DatasetDefinitions {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:ArrayOfDataElement'}
         DataElement {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:DEFeatureClass'}
         DataElement {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:DEFeatureClass'}
         DataElement {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:DEFeatureClass'}
         DataElement {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:DEFeatureClass'}
         DataElement {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:DEFeatureClass'}
         DataElement {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:DEFeatureDataset'}
     Domains {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:ArrayOfDomain'}
     Sequences {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:ArrayOfSequence'}
     Version {}
     WorkspaceType {}

So far I have managed to sort by tag, but how do I sort by DataElement type? Here is my code so far:

root_1[:] = sorted(root_1,  key=attrgetter("tag")) # WorkspaceData, WorkspaceDefinition
for node in root_1.findall("*"):  # DatasetDefinitions, Domains, Sequences, Version, WorkspaceType
    node[:] = sorted(node, key=attrgetter("tag"))
    print(node)
    for subnode in node.findall("*"): #DataElement, Domain
        subnode[:] = sorted(subnode, key=attrgetter("tag"))
        #subnode[:] = sorted(subnode, key=subnode.get['xsi:type']) # not working!
        print("\t", subnode.tag, subnode.attrib)
        for subsubnode in subnode.findall("*"): 
            print("\t\t", subsubnode.tag, subsubnode.attrib)
            subsubnode[:] = sorted(subsubnode,  key=attrgetter("tag"))
python xml sorting elementtree xsitype
1 Answer

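IIUC, you can slightly change the key= argument of sorted(). ElementTree stores a namespaced attribute such as xsi:type under the key {namespace-uri}localname, so the lookup has to use the full {http://www.w3.org/2001/XMLSchema-instance}type name: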

import xml.etree.ElementTree as ET
from operator import attrgetter

data = """<esri:Workspace xmlns:esri='http://www.esri.com/schemas/ArcGIS/10.8' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:xs='http://www.w3.org/2001/XMLSchema'>
    <WorkspaceDefinition xsi:type='esri:WorkspaceDefinition'>
        <WorkspaceType>esriLocalDatabaseWorkspace</WorkspaceType>
        <Version/>
        <Domains xsi:type='esri:ArrayOfDomain'/>
        <Sequences xsi:type='esri:ArrayOfSequence'/>
        <DatasetDefinitions xsi:type='esri:ArrayOfDataElement'>
            <DataElement xsi:type='esri:DEFeatureClass'/>
            <DataElement xsi:type='esri:DEFeatureClass'/>
            <DataElement xsi:type='esri:DEFeatureClass'/>
            <DataElement xsi:type='esri:DEFeatureDataset'/>
            <DataElement xsi:type='esri:DEFeatureClass'/>
            <DataElement xsi:type='esri:DEFeatureClass'/>
        </DatasetDefinitions>
    </WorkspaceDefinition>
    <WorkspaceData xsi:type='esri:WorkspaceData'/>
</esri:Workspace>"""

root_1 = ET.fromstring(data)

root_1[:] = sorted(root_1, key=attrgetter("tag"))  # WorkspaceData, WorkspaceDefinition

for node in root_1.findall(
    "*"
):  # DatasetDefinitions, Domains, Sequences, Version, WorkspaceType
    node[:] = sorted(node, key=attrgetter("tag"))
    print(node)
    for subnode in node.findall("*"):  # DataElement, Domain
        subnode[:] = sorted(
            subnode,
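            key=lambda child: (            # <--- change key= here
                child.tag,
                # namespaced attributes are keyed as "{namespace-uri}localname";
                # default to "" so elements without xsi:type stay comparable
                child.get("{http://www.w3.org/2001/XMLSchema-instance}type", ""),
            ),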
        )
        print("\t", subnode.tag, subnode.attrib)
        for subsubnode in subnode.findall("*"):
            print("\t\t", subsubnode.tag, subsubnode.attrib)
            subsubnode[:] = sorted(
                subsubnode,
                key=attrgetter("tag"),
            )

Prints:

<Element 'WorkspaceData' at 0x7f5ff630bec0>
<Element 'WorkspaceDefinition' at 0x7f5ff6316610>
         DatasetDefinitions {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:ArrayOfDataElement'}
                 DataElement {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:DEFeatureClass'}
                 DataElement {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:DEFeatureClass'}
                 DataElement {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:DEFeatureClass'}
                 DataElement {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:DEFeatureClass'}
                 DataElement {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:DEFeatureClass'}
                 DataElement {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:DEFeatureDataset'}
         Domains {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:ArrayOfDomain'}
         Sequences {'{http://www.w3.org/2001/XMLSchema-instance}type': 'esri:ArrayOfSequence'}
         Version {}
         WorkspaceType {}