Using find_elements_by_xpath at multiple positions

Question

Here is the HTML snippet:

<section class="node_category" id="kui_3_1515304072474_68">
    <h3 class="">User details</h3>
<ul class="" id="kui_3_1515304072474_67">
<li class="contentnode" id="kui_3_1515304072474_66">
<dl id="kui_3_1515304072474_65">
<dt class="">Country
</dt>
<dd class="" id="kui_3_1515304072474_64">United States
</dd>
</dl></li>
<li class="contentnode">
<dl>
<dt class="">City/town
</dt>
<dd class="">Somewhere
</dd>
</dl></li>
<li class="contentnode" id="kui_3_1515304072474_76">
<dl id="kui_3_1515304072474_75">
<dt class="">Company
</dt>
<dd class="" id="kui_3_1515304072474_74">ABC Inc
</dd>
</dl></li>
</ul></section>

I want to extract the text of the element matched by the following XPath:

/ul/li[@class='contentnode'][3]/dl/dd

The "contentnode" class appears at positions 1 through 3 in this example, and at up to 6 positions on other pages. To select all positions, I constructed the XPath below:

//li[@class='contentnode'][1 <= position() and position() < 7]/dl/dd

Then I plug it into my Python code like this:

from selenium import webdriver


lst=[]
browser = webdriver.Chrome('./path')
url = "https://<target URL>"
browser.get(url)
contents = browser.find_elements_by_xpath("//li[@class='contentnode'][1 <= position() and position() < 7]/dl/dd")

for t in contents:

    lst.append([t.text])

print(lst)

However, the output only shows position 1. It should show the text from every position, 1 through 6.

[Edit] I also tried:

//li[@class='contentnode'][contains(@id,'kui_3')]/dl/dd

but that does not work either. It raises no error, but the result is empty.

What's wrong with my code?

Answer 1

Here is working code for what you need:

from selenium import webdriver


lst = []
browser = webdriver.Chrome()
browser.get("https://<target URL>")

contents = browser.find_elements_by_xpath("//li[@class='contentnode'][1 <= position() and position() < 7]/dl/dd")

for t in contents:

    lst.append(t.text)

print(lst)

browser.quit()

The result will be (according to your HTML):

['United States', 'Somewhere', 'ABC Inc']

Hope it helps!

Answer 2

Try the code below:

from selenium import webdriver

lst=[]
browser = webdriver.Chrome('./path')
url = "https://<target URL>"
browser.get(url)
contents = browser.find_elements_by_xpath("//li[@class='contentnode']/dl/dd")
print(len(contents))  # number of matched <dd> elements

for t in contents:
    lst.append(t.text)

print(lst)
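
To check the simpler XPath from this answer without a browser, here is a minimal sketch that runs it against a trimmed copy of the HTML from the question. It assumes the standard library's xml.etree.ElementTree as a stand-in for Selenium (ElementTree supports only a subset of XPath, so the position() predicate from the question is not tested here); the helper name extract_dd_texts is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the HTML from the question (ids omitted for brevity).
HTML = """
<section class="node_category">
  <h3 class="">User details</h3>
  <ul class="">
    <li class="contentnode"><dl><dt class="">Country
</dt><dd class="">United States
</dd></dl></li>
    <li class="contentnode"><dl><dt class="">City/town
</dt><dd class="">Somewhere
</dd></dl></li>
    <li class="contentnode"><dl><dt class="">Company
</dt><dd class="">ABC Inc
</dd></dl></li>
  </ul>
</section>
"""


def extract_dd_texts(markup):
    """Return the stripped text of every <dd> under li.contentnode (hypothetical helper)."""
    root = ET.fromstring(markup.strip())
    # No position() predicate: the class filter alone matches every item.
    matches = root.findall(".//li[@class='contentnode']/dl/dd")
    return [dd.text.strip() for dd in matches]


print(extract_dd_texts(HTML))  # ['United States', 'Somewhere', 'ABC Inc']
```

If this matches all three items but the browser still returns only one or none, the difference probably lies in the live page (for example, content that has not loaded yet) rather than in the XPath itself.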

Answer 3

Did you try a CSS selector? If not, you should give it a go:

for items in browser.find_elements_by_css_selector(".contentnode"):
    data = ' '.join([' '.join(item.text.split()) for item in items.find_elements_by_css_selector("dd")])
    print(data)
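
The `' '.join(item.text.split())` idiom in the snippet above collapses runs of whitespace, including the trailing newlines inside the dd elements, into single spaces. A minimal sketch of just that step, with a hypothetical helper name normalize_whitespace:

```python
def normalize_whitespace(text):
    """Collapse any run of whitespace into one space and trim both ends."""
    # str.split() with no arguments splits on whitespace runs and drops empty pieces.
    return " ".join(text.split())


print(normalize_whitespace("  United   States\n"))  # United States
```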
