Extract an attribute value with lxml
I have the following XML file:
'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\r\n<w:document xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape" mc:Ignorable="w14 wp14"><w:body><w:p w:rsidR="00706A37" w:rsidRPr="004A1CE5" w:rsidRDefault="004A1CE5"><w:pPr><w:pStyle w:val="Heading1"/><w:numPr><w:ilvl w:val="12"/><w:numId w:val="0"/></w:numPr><w:rPr><w:sz w:val="28"/><w:szCs w:val="28"/></w:rPr></w:pPr><w:commentRangeStart w:id="0"/><w:r w:rsidRPr="004A1CE5"><w:rPr><w:sz w:val="28"/><w:szCs w:val="28"/></w:rPr><w:t>H</w:t></w:r><w:commentRangeEnd w:id="0"/><w:r w:rsidR="00A23794"><w:rPr><w:rStyle w:val="CommentReference"/>
I need to extract the value of the id attribute of a <w:commentRangeStart> tag. I have looked over many questions on SO and found the following type:
I tried iterating over every p that contains a commentRangeStart tag and retrieving its attrib, but this returned nothing:
for p in lxml_tree.xpath('.//w:p/commentRangeStart', namespaces={'w': w}):
    print(p.attrib)
I tried various combinations with 'commentRangeStart[@id]' and commentRangeStart/@id, but none worked. I referred to many questions, and one of them is here.
I would prefer an approach that goes over every p and then searches for the comment tag, like:
for p in lxml_tree.xpath('.//w:p', namespaces={'w': w}):
    p.xpath(./w:commentRangeStart/...)
and so on.
What's wrong with my expression?
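You need to qualify the namespace on commentRangeStart as well. Without the w: prefix, XPath looks for an element in no namespace, so nothing matches. With the prefix:
for p in root.xpath('.//w:p/w:commentRangeStart', namespaces={'w': w}):
    print(p.attrib)
Output: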
{'{http://schemas.openxmlformats.org/wordprocessingml/2006/main}id': '0'}
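To see the difference between the unqualified and the qualified step end to end, here is a minimal self-contained sketch. The names W_NS, NSMAP and SAMPLE are illustrative assumptions, and SAMPLE is a trimmed, well-formed stand-in for the document in the question, not the original file.

```python
from lxml import etree

# Namespace URI bound to the w: prefix in the question's document.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NSMAP = {"w": W_NS}  # prefix -> URI mapping passed to xpath()

# Trimmed stand-in for the question's XML (an assumption for this sketch).
SAMPLE = (
    '<w:document xmlns:w="%s"><w:body>'
    '<w:p><w:commentRangeStart w:id="0"/>'
    '<w:r><w:t>H</w:t></w:r>'
    '<w:commentRangeEnd w:id="0"/></w:p>'
    '</w:body></w:document>' % W_NS
).encode("utf-8")

root = etree.fromstring(SAMPLE)

# Unprefixed step: looks for commentRangeStart in no namespace.
unqualified = root.xpath(".//w:p/commentRangeStart", namespaces=NSMAP)

# Prefixed step: looks for commentRangeStart in the w namespace.
qualified = root.xpath(".//w:p/w:commentRangeStart", namespaces=NSMAP)

# The attribute is namespaced too, so its key uses Clark notation: {uri}id.
id_key = "{%s}id" % W_NS
ids = [element.get(id_key) for element in qualified]

print(len(unqualified), len(qualified), ids)
```

The same Clark-notation key is what appears in the printed attrib mapping above.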
Alternatively, select the attribute value directly. The attribute is namespaced too, so it is @w:id rather than @id:
for id_ in root.xpath('.//w:p/w:commentRangeStart/@w:id', namespaces={'w': w}):
    print(id_)
Output:
0
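The question also asks for a way to go over every p first and then search inside it. One possible sketch, assuming the same namespace URI as above: comment_ids_by_paragraph, NSMAP and SAMPLE are hypothetical names, and SAMPLE is a made-up two-paragraph document rather than the original file.

```python
from lxml import etree

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NSMAP = {"w": W_NS}  # prefix -> URI mapping passed to xpath()


def comment_ids_by_paragraph(root):
    """Return one list of commentRangeStart ids per w:p element."""
    result = []
    for p in root.xpath(".//w:p", namespaces=NSMAP):
        # "./" makes the query relative to this paragraph and matches
        # direct children only; use ".//" to search all descendants.
        ids = p.xpath("./w:commentRangeStart/@w:id", namespaces=NSMAP)
        result.append([str(value) for value in ids])
    return result


# Made-up sample: the first paragraph has a comment, the second has none.
SAMPLE = (
    '<w:document xmlns:w="%s"><w:body>'
    '<w:p><w:commentRangeStart w:id="0"/><w:r><w:t>H</w:t></w:r></w:p>'
    '<w:p><w:r><w:t>no comment here</w:t></w:r></w:p>'
    '</w:body></w:document>' % W_NS
).encode("utf-8")

root = etree.fromstring(SAMPLE)
ids_per_p = comment_ids_by_paragraph(root)
print(ids_per_p)
```

Each call to p.xpath needs the namespaces mapping again, since the prefix binding is supplied per call rather than stored on the element.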