XMLParser skips attributes when parsing an XML Schema file


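I need to read an XML Schema file and extract only the elements that have the attribute minOccurs="0". However, the XML parser appears to skip that attribute when it parses the document.

The code below shows the problem.

I have a sample XML file: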

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
    <xsd:include schemaLocation="def.xml"/>
    <xsd:element name="MainElementName">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="A">
                    <xsd:complexType>
                        <xsd:attribute name="AA" required="False" type="string"/>
                    </xsd:complexType>
                </xsd:element>
                <xsd:element name="B" minOccurs="0" maxOccurs="unbounded">
                    <xsd:complexType>
                        <xsd:attribute name="BA" type="string"/>
                    </xsd:complexType>
                </xsd:element>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>

Then I parse it with the following code:

    from lxml import etree as et

    try:
        # et.parse() opens the file itself, so no separate open() is needed
        parser = et.XMLParser(remove_blank_text=True)
        tree = et.parse(xsd_path, parser)
        tmp_text = et.tostring(tree, pretty_print=True, encoding=str)
    except IOError as e:
        print(e)

I get this output:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
    <xsd:include schemaLocation="def.xml"/>
    <xsd:element name="MainElementName">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="A">
                    <xsd:complexType>
                        <xsd:attribute name="AA" type="string"/>
                    </xsd:complexType>
                </xsd:element>
                <xsd:element name="B">
                    <xsd:complexType>
                        <xsd:attribute name="BA" type="string"/>
                    </xsd:complexType>
                </xsd:element>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>

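I don't know why the parser skips the required attribute on the xsd:attribute declaration and the minOccurs/maxOccurs attributes on the xsd:element. Does anyone know how to fix this?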

python xsd lxml
1 answer

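Your code is incorrect; I could not run it as posted. For example, ElementTree.tostring() expects an Element instance, but you are passing an ElementTree instance (tree). That applies to the standard library's xml.etree.ElementTree, which I use below; lxml's tostring() also accepts a tree.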

This code works for me:

import xml.etree.ElementTree as et

parser = et.XMLParser()
tree = et.parse('/path/to.xml', parser)
tmp_text = et.tostring(tree.getroot(), encoding='unicode')
print(tmp_text)

With Python 3.6.8, the output is:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:include schemaLocation="def.xml" />
    <xs:element name="MainElementName">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="A">
                    <xs:complexType>
                        <xs:attribute name="AA" required="False" type="string" />
                    </xs:complexType>
                </xs:element>
                <xs:element maxOccurs="unbounded" minOccurs="0" name="B">
                    <xs:complexType>
                        <xs:attribute name="BA" type="string" />
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>
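
Since the attributes survive parsing, the original goal (collecting only the elements declared with minOccurs="0") can be handled by reading the attribute directly from each element. The following is a minimal sketch, not a definitive implementation: the helper name `optional_element_names` is hypothetical, and the schema is a trimmed copy of the one in the question, inlined as a string so the example runs on its own.

```python
import xml.etree.ElementTree as et

# Namespace URI taken from the schema in the question.
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Trimmed copy of the question's schema, inlined so the sketch is self-contained.
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="MainElementName">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="A"/>
                <xsd:element name="B" minOccurs="0" maxOccurs="unbounded"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>"""


def optional_element_names(root):
    """Return the names of all xsd:element nodes declared with minOccurs="0".

    Hypothetical helper written for this illustration.
    """
    # ElementTree addresses namespaced tags as "{namespace-uri}localname".
    tag = "{%s}element" % XSD_NS
    return [el.get("name") for el in root.iter(tag) if el.get("minOccurs") == "0"]


root = et.fromstring(SCHEMA)
print(optional_element_names(root))  # prints ['B']
```

For a schema stored on disk, `et.parse(path).getroot()` can replace `et.fromstring(SCHEMA)`. Matching on the full namespace URI rather than on the `xsd:` prefix keeps the lookup working even when a schema uses a different prefix such as `xs:`.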