
How to find XPath elements/tags of a specific node

I have a page with the following structure:

<doc>
  <tbody>
   .
   .
   .
  <tbody>
    <tr>
       <td>
       .
       .
  </tbody>
  ....
</doc>

I'm able to get to the specific table I want with this XPath:

response.xpath('//tbody')[8].get()

but I'm struggling with the syntax to get elements/tags within tbody[8]. So far I've tried:

>>> response.xpath('//tbody')[8]/tr.get()
Traceback (most recent call last):
File "<console>", line 1, in <module>
NameError: name 'tr' is not defined

along with several other attempts, but they all fail due to (I believe) syntax. How can I get to the tr and td tags inside tbody? No matter what I try, I can't seem to add anything after tbody')[8], and I can't wrap my head around why.

You're on the right track, but you need to provide the whole XPath string as an argument to the xpath() function, rather than trying to stick pieces of it outside.

response.xpath('//tbody') returns a list of the elements matched by the XPath, and the [8] after it is a Python index operator, not part of the XPath. When you then try to continue the XPath after it, Python reads /tr as a division by a variable named tr, which is why you get the NameError.

If you take a look at some of the examples in https://docs.scrapy.org/en/latest/topics/selectors.html , you should be able to see what you're doing wrong.

Your /tr is supposed to go in the same XPath string:

response.xpath('//tbody[9]/tr').get()

Also note that although XPath supports indexing like Python, XPath indexes start from 1 instead of 0. So if index [8] gives you the correct element in Python, you will want index [9] in the XPath expression. Bear in mind that //tbody[9] matches a tbody that is the ninth tbody child of its own parent; to take the ninth match in the whole document, as the Python index does, write (//tbody)[9]/tr instead.
