Python lxml XPath SyntaxError: invalid predicate

Here is part of an XML file that describes forms:

<?xml version="1.0" encoding="utf-8"?>
<ArrayOfHouse>
<XmlForm>
<houseNum>1</houseNum>
 <plan1> 
  <coord>
    <X> 1.2  </X>
    <Y> 2.1  </Y>
    <Z> 3.0  </Z>
  </coord>
  <color> 
    <R> 255 </R>
    <G> 0   </G>
    <B> 0   </B>
  </color>
 </plan1>
 <plan2>
  <coord>  
    <X> 21.2  </X>
    <Y> 22.1  </Y>
    <Z> 31.0  </Z>
  </coord>
  <color> 
    <R> 255 </R>
    <G> 0   </G>
    <B> 0   </B>
</color>
 </plan2> 
</XmlForm>


<XmlForm>
<houseNum>2</houseNum>
 <plan1> 
  <coord>
    <X> 11.2  </X>
    <Y> 12.1  </Y>
    <Z> 13.0  </Z>
  </coord>
  <color> 
    <R> 255 </R>
    <G> 255   </G>
    <B> 0   </B>
  </color>
 </plan1>
 <plan2>
  <coord>  
    <X> 211.2  </X>
    <Y> 212.1  </Y>
    <Z> 311.0  </Z>
  </coord>
  <color> 
    <R> 255 </R>
    <G> 0   </G>
    <B> 255   </B>
</color>
 </plan2> 
</XmlForm>
</ArrayOfHouse>

Here is my code to retrieve the coordinates of each plan for houses 1 and 2. The problem is in this line: coord=tree.findall("XmlForm/[houseNum=str(houseindex)] . The same error is raised when I use houseindex.__str__() instead.

import pandas as pd
import numpy as np
from lxml import etree
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
tree =etree.parse("myexample.xml")
#recuperate the columns name for pandas dataframe
planlist=tree.findall("XmlForm/[houseNum='1']/")

columns=[]

for el in planlist[1:]:
    columns.append(el.tag)

#Declare pandas dataFrame
df=pd.DataFrame(columns=list('XYZ'),dtype=float)
for houseindex in range(0,2):
    for index in range(len(columns)):

        coord=tree.findall("XmlForm/[houseNum=str(houseindex)]/"+columns[index]+"/coord/")
        XYZ=[]
        for cc in coord:
            XYZ.append(cc.text)
        df.loc[index]=XYZ
print(df)

You clearly want str(houseindex) to be evaluated in Python before the XPath expression is constructed. (Your error message is telling you that str() isn't an XPath function.)

Therefore, change the argument of coord=tree.findall() from

"XmlForm/[houseNum=str(houseindex)]/"+columns[index]+"/coord/"

to

"XmlForm/[houseNum="+str(houseindex)+"]/"+columns[index]+"/coord/"
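Printing both strings makes the difference visible. This is a small sketch using plain string operations only; houseindex = 1 and the name plan1 (standing in for columns[index]) are example values picked for illustration.

```python
# Example values, as they might appear in one iteration of the question's loop.
houseindex = 1
column = "plan1"  # stands in for columns[index]

# str(houseindex) sits inside the quotes, so it is literal text, not a call.
literal = "XmlForm/[houseNum=str(houseindex)]/" + column + "/coord/"

# Here str(houseindex) runs in Python and its result is concatenated in.
evaluated = "XmlForm/[houseNum=" + str(houseindex) + "]/" + column + "/coord/"

print(literal)
print(evaluated)
```

The first string still contains the text str(houseindex), which is what the path parser ends up seeing; the second contains the number itself.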

Two more fixes to that XPath:

  1. Remove the / before the predicate on XmlForm .
  2. Add quotes around the value that houseNum is compared against.

Final XPath with no further syntax errors

The following XPath combines all three fixes and has no further syntax errors:

"XmlForm[houseNum='"+str(houseindex)+"']/"+columns[index]+"/coord/"
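A minimal sketch of that path in use, assuming a trimmed-down copy of the question's XML held in a string. The SAMPLE constant and the coords_for helper are made up for this illustration, and the fallback to the standard library's xml.etree.ElementTree (used only if lxml is not installed) is an assumption of the sketch, not part of the original answer.

```python
try:
    from lxml import etree  # the library used in the question
except ImportError:  # assumption: the stdlib parser is acceptable for this sketch
    import xml.etree.ElementTree as etree

# Trimmed-down sample modeled on the question's file (one plan per house).
SAMPLE = """
<ArrayOfHouse>
  <XmlForm>
    <houseNum>1</houseNum>
    <plan1><coord><X> 1.2 </X><Y> 2.1 </Y><Z> 3.0 </Z></coord></plan1>
  </XmlForm>
  <XmlForm>
    <houseNum>2</houseNum>
    <plan1><coord><X> 11.2 </X><Y> 12.1 </Y><Z> 13.0 </Z></coord></plan1>
  </XmlForm>
</ArrayOfHouse>
"""


def coords_for(root, houseindex, column):
    """Return the X, Y, Z values of one plan of one house as floats."""
    # Same construction as the final path above: no "/" before the
    # predicate, and the house number wrapped in single quotes.
    path = "XmlForm[houseNum='" + str(houseindex) + "']/" + column + "/coord/"
    # float() tolerates the surrounding whitespace in the element text.
    return [float(el.text) for el in root.findall(path)]


root = etree.fromstring(SAMPLE.strip())
print(coords_for(root, 1, "plan1"))
print(coords_for(root, 2, "plan1"))
```

A house number that does not exist in the file, such as 3, simply gives an empty list rather than an error.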

You never inject the value of houseindex into your string. Also be careful with your for loop over houseindex: you currently use range(0, 2), which yields 0 and 1. Based on your XML example, you want range(1, 3) instead.
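The off-by-one is easy to check in isolation. In this short sketch, the set of house numbers is copied by hand from the houseNum values in the question's XML.

```python
# House numbers present in the question's XML, as strings.
house_numbers = {"1", "2"}

for bounds in [(0, 2), (1, 3)]:
    indices = list(range(*bounds))
    # Keep only the loop values that correspond to an existing houseNum.
    matched = [i for i in indices if str(i) in house_numbers]
    print(bounds, indices, matched)
```

With range(0, 2) only house 1 is ever matched; with range(1, 3) both houses are.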

I believe you want something like this (I slightly refactored your code to improve readability):

import pandas as pd
from lxml import etree

tree = etree.parse("myexample.xml")

# Collect the plan tag names (plan1, plan2, ...) from house 1,
# skipping the first child, which is houseNum.
plan_list = tree.findall("XmlForm[houseNum='1']/")
columns = [el.tag for el in plan_list[1:]]

# Build one row of X, Y, Z values per plan of each house.
data = []
for house_index in range(1, 3):
    for column in columns:
        # No "/" before the predicate; the house number is quoted.
        element_text = "XmlForm[houseNum='{index}']/{column}/coord/".format(index=house_index, column=column)
        coord = tree.findall(element_text)
        row = [cc.text for cc in coord]
        data.append(row)

df = pd.DataFrame(data, columns=list('XYZ'), dtype=float)
print(df)
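As an alternative that is not from either answer, the path strings can be avoided altogether by walking the XmlForm elements directly, which also keeps track of which house and plan each row belongs to. This is only a sketch: SAMPLE is a shortened stand-in for the question's file, collect_rows is a hypothetical helper, and the fallback to xml.etree.ElementTree when lxml is missing is an assumption made so the snippet runs anywhere.

```python
try:
    from lxml import etree
except ImportError:  # assumption: the stdlib parser is acceptable for this sketch
    import xml.etree.ElementTree as etree

# Shortened stand-in for the question's file.
SAMPLE = """
<ArrayOfHouse>
  <XmlForm>
    <houseNum>1</houseNum>
    <plan1><coord><X> 1.2 </X><Y> 2.1 </Y><Z> 3.0 </Z></coord></plan1>
    <plan2><coord><X> 21.2 </X><Y> 22.1 </Y><Z> 31.0 </Z></coord></plan2>
  </XmlForm>
  <XmlForm>
    <houseNum>2</houseNum>
    <plan1><coord><X> 11.2 </X><Y> 12.1 </Y><Z> 13.0 </Z></coord></plan1>
  </XmlForm>
</ArrayOfHouse>
"""


def collect_rows(root):
    """Return (houseNum, plan tag, X, Y, Z) for every child that has a coord."""
    rows = []
    for form in root.findall("XmlForm"):
        house = form.findtext("houseNum").strip()
        for plan in form:
            coord = plan.find("coord")
            if coord is None:  # skips houseNum and anything without coordinates
                continue
            xyz = [float(coord.findtext(axis)) for axis in "XYZ"]
            rows.append((house, plan.tag, *xyz))
    return rows


rows = collect_rows(etree.fromstring(SAMPLE.strip()))
for row in rows:
    print(row)
```

The resulting tuples could then be handed to pd.DataFrame with a matching list of five column names, if a DataFrame is still the goal.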
