How do I access a given xpath result?

Question

I'm trying to scrape a web page like this:

<html>
etc etc..
<div id='due'>
    <h2>title</h2>
    <div>
        <div class='desc'>
           sub1
        </div>
    </div>
    <div>
        <div class='desc'>
           sub2
        </div>
    </div>
    <div>
        <div class='desc'>
           subn
        </div>
    </div>
    <h2>title2</h2>
    <div>
        <div class='desc'>
           sub1
        </div>
    </div>
    <div>
        <div class='desc'>
           sub2
        </div>
    </div>
    <div>
        <div class='desc'>
           subn
        </div>
    </div>
</div>
etc etc..
</html>

I first tried to scrape the section:

box = tree.xpath('//*[@id="due"]/*')

then:

for div in box:
    print(div.tag)

It correctly returns the tag of every child element, but if I do:

for div in box:
    if div.tag == 'div':
        print(div.xpath('//div[@class="desc"]').text)

it runs the same search n times from the start of the document rather than from each individual 'div'.

I would expect:

sub1
sub2
subn
sub1
sub2
subn

Instead it fails because a list has no ".text" attribute, and if I print each list I get:

[sub1, sub2, subn, sub1, sub2, subn]
[sub1, sub2, subn, sub1, sub2, subn]
[sub1, sub2, subn, sub1, sub2, subn]
[sub1, sub2, subn, sub1, sub2, subn]
[sub1, sub2, subn, sub1, sub2, subn]
[sub1, sub2, subn, sub1, sub2, subn]

Yes, you might think I should just run the query once, but I need to make some variations on each iteration and build data relations, so how can I fix this?

Thank you in advance.

Answer 1

In the end I did not solve the problem with xpath; I just switched to bs4.

Answer 2

For future reference, to solve your problem with xpath, try this:

import lxml.html as lh
scr = """[your html above]"""
doc = lh.fromstring(scr)
for t in doc.xpath('//div[@id="due"]//div[@class="desc"]/text()'):
    print(t.strip())

Output:

sub1
sub2
subn
sub1
sub2
subn
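
If you need to keep the per-'div' loop from the question (for example to relate each description to the h2 above it), a relative XPath may be what you want. In lxml, an expression starting with `//` is evaluated from the document root even when called on a child element, while `.//` searches only below the current element. The following is a minimal sketch under that assumption; the `groups` dictionary and the shortened sample HTML are illustrative and not part of the original post.

```python
import lxml.html as lh

# Shortened version of the HTML from the question (illustrative sample).
html = """
<html><body>
<div id='due'>
    <h2>title</h2>
    <div><div class='desc'> sub1 </div></div>
    <div><div class='desc'> sub2 </div></div>
    <div><div class='desc'> subn </div></div>
    <h2>title2</h2>
    <div><div class='desc'> sub1 </div></div>
    <div><div class='desc'> sub2 </div></div>
    <div><div class='desc'> subn </div></div>
</div>
</body></html>
"""

doc = lh.fromstring(html)

groups = {}      # hypothetical structure: h2 title -> list of descriptions
current = None   # title of the most recent h2 seen

for el in doc.xpath('//*[@id="due"]/*'):
    if el.tag == 'h2':
        current = el.text_content().strip()
        groups[current] = []
    elif el.tag == 'div' and current is not None:
        # The leading "." makes the search relative to this element only.
        for desc in el.xpath('.//div[@class="desc"]'):
            groups[current].append(desc.text_content().strip())

for title, descs in groups.items():
    print(title, descs)
```

Note that `xpath()` always returns a list here, so `.text` or `text_content()` has to be called on each item rather than on the result itself.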

2 answers

Answer 1
0 · Accepted · 2020-11-17 22:23:15

Answer 2
0 · 2020-11-18 13:45:11
