How to get text without attributes from a tooltip using Selenium Webdriver with Python?

Question

I have a web element with a tooltip that shows the following message: ● Client Book Revenue $20,966,618

The HTML code for that tooltip is below. I am able to hover over the web element using Selenium Webdriver, which makes the tooltip visible, but I can't figure out how to get the text from it. Could somebody please help?

<div class="highcharts-tooltip" style="position: absolute; left: 755px; top: 0px; display: block; opacity: 1; pointer-events: none; visibility: visible;">
    <span style="position: absolute; font-family: 'Roboto',sans-serif; font-size: 12px; white-space: nowrap; color: rgb(51, 51, 51); margin-left: 0px; margin-top: 0px; left: 0px; top: 0px;">
        <div class="client-rate-bench-chart">
            <table class="table rdo-table-tooltip">
                <tbody>
                    <tr>
                        <td>
                            <span style="color:rgba(45,108,162,1)">●</span>
                           Client Book Revenue
                        </td>
                        <td> $20,966,618 </td>
                    </tr>
                </tbody>
           </table>
        </div>
    </span>
</div>

Answer 1

You can grab the table and then grab the first instance of <tr>:

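from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Firefox()
driver.get(URL)  # URL is a placeholder for the address of the page
# Hover over the element first so that the tooltip is present in the DOM
html = driver.page_source  # this is how you get the HTML

soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly
table = soup.find('table', class_='rdo-table-tooltip')
tooltip = table.find('tr')
text = tooltip.text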

text will have a lot of extra whitespace because of how the HTML is formatted, but you can strip that out: just split on all whitespace and then re-join the elements like this:

final_text = ' '.join(text.split())
print(final_text)
# ● Client Book Revenue $20,966,618
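The tooltip markup only exists once the hover has happened, so the text can also be read from the live element instead of from page_source. The following is a minimal sketch under stated assumptions, not a definitive implementation: read_tooltip and normalize are hypothetical helper names, point_selector stands for whatever CSS selector matches the chart element you hover over (the question does not give one), and the tooltip selector is taken from the class names in the HTML above.

```python
def normalize(text):
    """Collapse every run of whitespace (including newlines) into one space."""
    return ' '.join(text.split())


def read_tooltip(driver, point_selector, timeout=5):
    """Hover over a chart element and return the text of the tooltip row.

    point_selector is a hypothetical CSS selector for the element that
    triggers the tooltip; it is not part of the original question.
    """
    # Imported here so that normalize() stays usable without Selenium installed.
    from selenium.webdriver.common.action_chains import ActionChains
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Move the mouse over the element so that the tooltip is rendered.
    target = driver.find_element(By.CSS_SELECTOR, point_selector)
    ActionChains(driver).move_to_element(target).perform()

    # Wait until the tooltip row from the question's HTML is visible.
    row = WebDriverWait(driver, timeout).until(
        EC.visibility_of_element_located(
            (By.CSS_SELECTOR, '.highcharts-tooltip table.rdo-table-tooltip tr')
        )
    )
    return normalize(row.text)
```

With an open driver, read_tooltip(driver, point_selector) would return the row text with the whitespace already collapsed. The explicit wait is there because the tooltip may take a moment to appear after the mouse moves.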

For multiple <tr>s you can use .find_all('tr') and then use a list comprehension to get a list of the contents of the rows. It would look something like this:

soup = BeautifulSoup(html, 'html.parser')
table = soup.find('table', class_='rdo-table-tooltip')
tooltips = table.find_all('tr')
text = [' '.join(tooltip.text.split()) for tooltip in tooltips]

Then text will be a list of strings containing the text from each <tr>.
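If the label and the value are needed separately, the cells of each row can be kept apart rather than joined into one string. The sketch below swaps BeautifulSoup for the standard library's html.parser so that it runs without extra packages; RowCollector and tooltip_rows are hypothetical names, and the sample markup is a trimmed copy of the table from the question.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the text of every <td>, grouped by the <tr> it belongs to."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._cell = None   # text fragments of the <td> being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self.rows.append([])
        elif tag == 'td':
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a <td>.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == 'td' and self._cell is not None:
            # Join the fragments and collapse the whitespace between them.
            text = ' '.join(''.join(self._cell).split())
            if self.rows:
                self.rows[-1].append(text)
            self._cell = None


def tooltip_rows(page_html):
    """Return a list of rows, each a list of cleaned cell texts."""
    parser = RowCollector()
    parser.feed(page_html)
    parser.close()
    return parser.rows


sample = """
<table class="table rdo-table-tooltip"><tbody><tr>
  <td><span style="color:rgba(45,108,162,1)">●</span>
      Client Book Revenue
  </td>
  <td> $20,966,618 </td>
</tr></tbody></table>
"""
print(tooltip_rows(sample))
# [['● Client Book Revenue', '$20,966,618']]
```

Each inner list holds the cells of one <tr>, so the amount is available as the second item of the first row without any further string splitting.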

Answer 2

As an alternative you could use re.findall to return all the instances of text between tags. This will involve some cleaning up afterwards, but I have found it pretty handy in general when working with Selenium.

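import re

# html is the page source, e.g. driver.page_source
# The closing tag is </tr>; matching up to another <tr> would find nothing here
tooltips = re.findall(r'<tr>(.*?)</tr>', html.replace('\n', ''))

for tooltip in tooltips:
    print(tooltip)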
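The matches returned this way still contain the inner <td> and <span> tags, which is the cleaning up mentioned above. One possible way to do it, as a rough sketch: rows_from_html is a hypothetical helper that strips the remaining tags with a second regular expression, decodes HTML entities, and collapses whitespace. Regular expressions are fine for a small, predictable fragment like this tooltip, but they are not a general HTML parser.

```python
import re
from html import unescape


def rows_from_html(page_html):
    """Return the cleaned text of each <tr>...</tr> block found in page_html."""
    # DOTALL lets '.' match newlines, so the markup need not be flattened first.
    raw_rows = re.findall(r'<tr\b[^>]*>(.*?)</tr>', page_html, flags=re.DOTALL)
    cleaned = []
    for raw in raw_rows:
        no_tags = re.sub(r'<[^>]+>', ' ', raw)   # replace each remaining tag with a space
        text = unescape(no_tags)                 # turn entities such as &amp; into characters
        cleaned.append(' '.join(text.split()))   # collapse runs of whitespace
    return cleaned


sample = """<tr>
    <td>
        <span style="color:rgba(45,108,162,1)">●</span>
       Client Book Revenue
    </td>
    <td> $20,966,618 </td>
</tr>"""
print(rows_from_html(sample))
# ['● Client Book Revenue $20,966,618']
```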


Answer 1: score 1, accepted, 2017-04-12 22:14:06

Answer 2: score 0, 2017-04-12 22:31:15