When working with XML in lxml, my code ends up looking like this:
from lxml import etree
NSMAP = {
    'ns1': 'https://example.com/ns1/',
    'ns2': 'https://example.com/ns2/',
}
root = etree.parse('/some/file.xml')
root.find(..., namespaces=NSMAP)
root.iterfind(..., namespaces=NSMAP)
root.xpath(..., namespaces=NSMAP)
Is there a way to omit the repeated
namespaces=NSMAP
and set it as the default for the elements?
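You can avoid repeating the argument by defining a custom Element class whose find, iterfind, and xpath methods pass the NSMAP prefix map for you, and having the parser build elements of that class. Note that this supplies a default prefix map for lookups; it does not change the default namespace of the document itself. Here is an example: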
from lxml import etree
NSMAP = {
    'ns1': 'https://example.com/ns1/',
    'ns2': 'https://example.com/ns2/',
}
class DefaultNamespaceElement(etree.ElementBase):
    # Each override falls back to NSMAP unless the caller passes namespaces explicitly
    def find(self, path, namespaces=None):
        return super().find(path, namespaces=namespaces or NSMAP)
    def iterfind(self, path, namespaces=None):
        return super().iterfind(path, namespaces=namespaces or NSMAP)
    def xpath(self, _path, namespaces=None, **kwargs):
        return super().xpath(_path, namespaces=namespaces or NSMAP, **kwargs)
# Create a parser that builds DefaultNamespaceElement instances.
# (XMLParser(target=...) expects a parser target object, not an element class.)
parser = etree.XMLParser()
parser.set_element_class_lookup(etree.ElementDefaultClassLookup(element=DefaultNamespaceElement))
# etree.parse() returns an ElementTree, whose own find/iterfind/xpath are not
# overridden, so take the root element
root = etree.parse('/some/file.xml', parser).getroot()
# Now find, iterfind, and xpath can be called without namespaces=NSMAP
result = root.find('...')
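To check the idea without a file on disk, here is a minimal, self-contained sketch that parses a small inline document. The names `PrefixedElement`, `make_parser`, and `XML` are made up for this illustration, and the document content is invented. The sketch registers the subclass through `set_element_class_lookup` with an `ElementDefaultClassLookup`, which is how lxml's documentation describes plugging in custom element classes, and it parses with `etree.fromstring` so that the result is an element rather than an `ElementTree`.

```python
from lxml import etree

# Hypothetical prefix map and document, used only for this illustration.
NSMAP = {
    "ns1": "https://example.com/ns1/",
    "ns2": "https://example.com/ns2/",
}
XML = (
    b'<ns1:root xmlns:ns1="https://example.com/ns1/" '
    b'xmlns:ns2="https://example.com/ns2/">'
    b"<ns2:item>a</ns2:item><ns2:item>b</ns2:item>"
    b"</ns1:root>"
)


class PrefixedElement(etree.ElementBase):
    """Element whose lookup methods fall back to NSMAP when no map is given."""

    def find(self, path, namespaces=None):
        return super().find(path, namespaces=namespaces or NSMAP)

    def iterfind(self, path, namespaces=None):
        return super().iterfind(path, namespaces=namespaces or NSMAP)

    def xpath(self, _path, namespaces=None, **kwargs):
        return super().xpath(_path, namespaces=namespaces or NSMAP, **kwargs)


def make_parser():
    """Return an XMLParser that instantiates PrefixedElement for every element."""
    parser = etree.XMLParser()
    lookup = etree.ElementDefaultClassLookup(element=PrefixedElement)
    parser.set_element_class_lookup(lookup)
    return parser


# fromstring() returns the root element directly (not an ElementTree).
root = etree.fromstring(XML, make_parser())

first = root.find("ns2:item")                            # no namespaces= needed
texts = [el.text for el in root.iterfind("ns2:item")]
count = root.xpath("count(ns2:item)")
```

Only the three overridden methods pick up the default map; `findall` and `findtext` would need the same treatment if you use them. The methods on the `ElementTree` object returned by `etree.parse()` are also unaffected, which is why working from the root element matters here.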