Python lxml XPath SyntaxError: invalid predicate
Here is part of an XML file that describes forms:
<?xml version="1.0" encoding="utf-8"?>
<ArrayOfHouse>
<XmlForm>
<houseNum>1</houseNum>
<plan1>
<coord>
<X> 1.2 </X>
<Y> 2.1 </Y>
<Z> 3.0 </Z>
</coord>
<color>
<R> 255 </R>
<G> 0 </G>
<B> 0 </B>
</color>
</plan1>
<plan2>
<coord>
<X> 21.2 </X>
<Y> 22.1 </Y>
<Z> 31.0 </Z>
</coord>
<color>
<R> 255 </R>
<G> 0 </G>
<B> 0 </B>
</color>
</plan2>
</XmlForm>
<XmlForm>
<houseNum>2</houseNum>
<plan1>
<coord>
<X> 11.2 </X>
<Y> 12.1 </Y>
<Z> 13.0 </Z>
</coord>
<color>
<R> 255 </R>
<G> 255 </G>
<B> 0 </B>
</color>
</plan1>
<plan2>
<coord>
<X> 211.2 </X>
<Y> 212.1 </Y>
<Z> 311.0 </Z>
</coord>
<color>
<R> 255 </R>
<G> 0 </G>
<B> 255 </B>
</color>
</plan2>
</XmlForm>
</ArrayOfHouse>
Here is my code to retrieve the coordinates of each plan for houses 1 and 2. The problem is in this line:
coord=tree.findall("XmlForm/[houseNum=str(houseindex)]
The same error is raised when using houseindex.__str__() instead.
import pandas as pd
import numpy as np
from lxml import etree
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
tree =etree.parse("myexample.xml")
#retrieve the column names for the pandas dataframe
planlist=tree.findall("XmlForm/[houseNum='1']/")
columns=[]
for el in planlist[1:]:
    columns.append(el.tag)
#Declare pandas dataFrame
df=pd.DataFrame(columns=list('XYZ'),dtype=float)
for houseindex in range(0,2):
    for index in range(len(columns)):
        coord=tree.findall("XmlForm/[houseNum=str(houseindex)]/"+columns[index]+"/coord/")
        XYZ=[]
        for cc in coord:
            XYZ.append(cc.text)
        df.loc[index]=XYZ
print(df)
You clearly want str(houseindex) to be interpreted in Python before constructing your XPath expression. (Your error message is telling you that str() isn't an XPath function.)
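To see why the original call fails, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree instead of lxml, on the assumption that findall() rejects this predicate the same way in both; the trimmed XML string and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for myexample.xml (illustrative data only).
root = ET.fromstring(
    "<ArrayOfHouse>"
    "<XmlForm><houseNum>1</houseNum></XmlForm>"
    "<XmlForm><houseNum>2</houseNum></XmlForm>"
    "</ArrayOfHouse>"
)
houseindex = 2

# Inside the quotes, str(houseindex) is just text; Python never calls it,
# so the path parser sees an expression it cannot interpret.
literal_path = "XmlForm[houseNum=str(houseindex)]"
try:
    root.findall(literal_path)
    error = None
except SyntaxError as exc:
    error = str(exc)
print(error)

# Building the string in Python first puts the actual number into the path.
# The value is quoted so the predicate compares against a string literal.
built_path = "XmlForm[houseNum='" + str(houseindex) + "']"
matches = root.findall(built_path)
print(built_path, len(matches))
```

The only difference between the two paths is where str(houseindex) is evaluated: inside the string literal it is never evaluated at all.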
Therefore, change the argument of coord=tree.findall() from
"XmlForm/[houseNum=str(houseindex)]/"+columns[index]+"/coord/"
to
"XmlForm/[houseNum="+str(houseindex)+"]/"+columns[index]+"/coord/"
Two more fixes to that XPath:
1. Remove the / before the predicate on XmlForm.
2. Add quotes around the value in the equality test against houseNum.
The following XPath has all three fixes combined and has no further syntax errors:
"XmlForm[houseNum='"+str(houseindex)+"']/"+columns[index]+"/coord/"
You don't inject houseindex into your string. Also be careful with the houseindex loop: you currently use range(0, 2), which yields 0 and 1. Based on your XML example, you want range(1, 3) instead.
I believe you want something like this (I slightly refactored your code to improve readability):
import pandas as pd
from lxml import etree

tree = etree.parse("myexample.xml")

# Retrieve the column names (plan1, plan2) for the pandas DataFrame,
# skipping the first child, houseNum.
plan_list = tree.findall("XmlForm[houseNum='1']/*")
columns = [el.tag for el in plan_list[1:]]

# Collect one row of X, Y, Z values per plan of each house.
data = []
for house_index in range(1, 3):
    for column in columns:
        element_text = "XmlForm[houseNum='{index}']/{column}/coord/*".format(index=house_index, column=column)
        coord = tree.findall(element_text)
        row = [cc.text for cc in coord]
        data.append(row)

df = pd.DataFrame(data, columns=list('XYZ'), dtype=float)
print(df)
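As an alternative sketch that avoids building path strings at all, one could loop over the XmlForm elements and read houseNum from each one. This is not the approach used in the answers above. It assumes the standard library's xml.etree.ElementTree and a trimmed XML string in place of myexample.xml, and it leaves pandas out to stay self-contained.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for myexample.xml (illustrative data only).
XML = """<ArrayOfHouse>
  <XmlForm>
    <houseNum>1</houseNum>
    <plan1><coord><X> 1.2 </X><Y> 2.1 </Y><Z> 3.0 </Z></coord></plan1>
    <plan2><coord><X> 21.2 </X><Y> 22.1 </Y><Z> 31.0 </Z></coord></plan2>
  </XmlForm>
  <XmlForm>
    <houseNum>2</houseNum>
    <plan1><coord><X> 11.2 </X><Y> 12.1 </Y><Z> 13.0 </Z></coord></plan1>
  </XmlForm>
</ArrayOfHouse>"""

root = ET.fromstring(XML)

rows = []
for form in root.findall("XmlForm"):
    house = int(form.findtext("houseNum"))
    for plan in form:
        coord = plan.find("coord")
        if coord is None:
            # <houseNum> has no <coord> child, so it is skipped here.
            continue
        # float() tolerates the surrounding whitespace in " 1.2 ".
        rows.append((house, plan.tag, *(float(c.text) for c in coord)))

for row in rows:
    print(row)
```

Each tuple holds the house number, the plan tag, and the X, Y, Z values, so the list could be passed to pd.DataFrame(rows, columns=[...]) if a DataFrame is needed.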