How to check xmlns in every element using lxml
I am using lxml to check Product elements as they stream through a MapReduce job. I want to make sure that every element carries only the correct xmlns value. For example, every Product element should have its xmlns set to "http://mynetwork.products.com/new":
<Product xmlns="http://mynetwork.products.com/new">
As I check each Product element (streamed one at a time), I just want to make sure that it looks like the above. I want to check for the following potential errors:
<Product xmlns="http://mynetwork.products.com/old">
<Product xmlns="">
<Product>
<Product xmlns="http://mynetwork.products.com/new" something="else">
I tried storing the value of Product.nsmap (which is a dictionary) for each element and then reading the dictionary's values to validate, but that doesn't help me detect any of the above cases. There must be a way.
You can check a combination of the nsmap and attrib properties of each Product element. nsmap should contain exactly one key-value pair, i.e. the key None with the value "http://mynetwork.products.com/new", and attrib should be empty, since you don't allow any attributes on the element.
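To make the two properties concrete, here is a minimal sketch of what they hold for a few of the cases from the question. The `describe` helper is a hypothetical name introduced only for illustration; it parses a single standalone element, so nothing is inherited from a parent.

```python
from lxml import etree


def describe(xml):
    """Parse one standalone element and return (nsmap, attrib) as plain dicts."""
    elem = etree.fromstring(xml)
    return dict(elem.nsmap), dict(elem.attrib)


good = describe('<Product xmlns="http://mynetwork.products.com/new"/>')
extra = describe('<Product xmlns="http://mynetwork.products.com/new" something="else"/>')
bare = describe('<Product/>')

print(good)   # ({None: 'http://mynetwork.products.com/new'}, {})
print(extra)  # ({None: 'http://mynetwork.products.com/new'}, {'something': 'else'})
print(bare)   # ({}, {})
```

The default namespace shows up in nsmap under the key None, while ordinary attributes such as something="else" appear only in attrib, which is why both need to be checked.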
Brief example (Python 2.7):
>>> from lxml import etree
>>> raw = '''<root>
... <Product xmlns="http://mynetwork.products.com/new"/>
... <Product xmlns="http://mynetwork.products.com/new" something="else"/>
... <Product xmlns="http://mynetwork.products.com/old" />
... <Product xmlns=""/>
... <Product/>
... </root>'''
>>> root = etree.fromstring(raw)
>>> for p in root.findall('*'):
...     # .get(None) avoids a KeyError when the only declared namespace is prefixed
...     isValid = len(p.nsmap) == 1 \
...         and p.nsmap.get(None) == 'http://mynetwork.products.com/new' \
...         and not p.attrib
...     print(isValid)
...
True
False
False
False
False
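For the streaming case, the same rule could be wrapped in a small function and applied as elements arrive. The following is only a sketch under stated assumptions: `EXPECTED_NS`, `SAMPLE`, `is_valid_product` and `validate_stream` are hypothetical names, and the input is assumed to be one XML document read through `etree.iterparse`. Note that nsmap also includes namespace declarations inherited from ancestor elements, so a wrapper element that declares namespaces of its own would change the result.

```python
from io import BytesIO

from lxml import etree

EXPECTED_NS = "http://mynetwork.products.com/new"  # value taken from the question

SAMPLE = b"""<root>
  <Product xmlns="http://mynetwork.products.com/new"/>
  <Product xmlns="http://mynetwork.products.com/new" something="else"/>
  <Product xmlns="http://mynetwork.products.com/old"/>
  <Product xmlns=""/>
  <Product/>
</root>"""


def is_valid_product(elem, expected_ns=EXPECTED_NS):
    """True if the only namespace in scope is the default one, equal to
    expected_ns, and the element carries no attributes."""
    return elem.nsmap == {None: expected_ns} and not elem.attrib


def validate_stream(source):
    """Validate each Product element as soon as its end tag has been parsed."""
    results = []
    for _event, elem in etree.iterparse(source, events=("end",)):
        # Match on the local name so Products in any (or no) namespace are checked.
        if etree.QName(elem).localname != "Product":
            continue
        results.append(is_valid_product(elem))
        elem.clear()  # release the element's content once it has been checked
    return results


print(validate_stream(BytesIO(SAMPLE)))  # [True, False, False, False, False]
```

If each Product instead arrives as its own string, calling `is_valid_product(etree.fromstring(fragment))` applies the same check to one fragment at a time.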