You can check whether the element's attributes do not contain lang:
from lxml import etree
xml_string = '''
<components version="1.0.0">
  <component type="foo">
    <sample>Foo</sample>
    <sample lang="a">abc</sample>
    <sample lang="b">efj</sample>
  </component>
</components>
'''
root = etree.fromstring(xml_string)
for sample in root.findall("component/sample"):
    if "lang" not in sample.attrib:
        print(sample.text)
Output:
Foo
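The same filter can also be written as an XPath predicate, since lxml elements provide an xpath() method. This is only a minimal sketch on the same sample document; the variable name texts is just for illustration:

```python
from lxml import etree

xml_string = '''
<components version="1.0.0">
  <component type="foo">
    <sample>Foo</sample>
    <sample lang="a">abc</sample>
    <sample lang="b">efj</sample>
  </component>
</components>
'''

root = etree.fromstring(xml_string)

# not(@lang) keeps only <sample> elements that have no lang attribute at all
texts = [s.text for s in root.xpath("component/sample[not(@lang)]")]
print(texts)
```

Note that findall() only understands a limited path subset, so the not(...) predicate needs xpath() rather than findall().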
Edit: if your attributes use the xml: namespace prefix, you can try the following:
from lxml import etree
xml_string = '''
<components version="1.0.0">
  <component type="foo">
    <sample>Foo</sample>
    <sample xml:lang="a">abc</sample>
    <sample xml:lang="b">efj</sample>
  </component>
</components>
'''
root = etree.fromstring(xml_string)
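for sample in root.findall("component/sample"):
    # http://www.w3.org/XML/1998/namespace is the URI bound to the xml: prefix;
    # for a different namespace in your document, use its namespace URI instead.
    lang = sample.attrib.get(r"{http://www.w3.org/XML/1998/namespace}lang")
    # Note: "not lang" also matches an empty value such as xml:lang=""
    if not lang:
        print(sample.text)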
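If you need to tell a missing xml:lang apart from an empty one (xml:lang=""), comparing against None is stricter than a truthiness check. A rough sketch, assuming the same document layout; XML_NS, XML_LANG and samples_without_lang are hypothetical names introduced here, not part of lxml:

```python
from lxml import etree

# Namespace URI that the reserved xml: prefix is bound to
XML_NS = "http://www.w3.org/XML/1998/namespace"
# Clark notation ({uri}localname), the form lxml uses for namespaced attribute keys
XML_LANG = f"{{{XML_NS}}}lang"


def samples_without_lang(root):
    """Return <sample> elements that carry no xml:lang attribute at all."""
    return [
        s for s in root.findall("component/sample")
        if s.get(XML_LANG) is None  # get() returns None only when the attribute is absent
    ]


doc = etree.fromstring(
    "<components><component>"
    "<sample>Foo</sample>"
    '<sample xml:lang="">empty</sample>'
    '<sample xml:lang="a">abc</sample>'
    "</component></components>"
)
print([s.text for s in samples_without_lang(doc)])
```

Here the element with xml:lang="" is skipped because the attribute is present, whereas the "if not lang" check would have printed it. Which behaviour is right depends on how your data uses empty values.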