
Is there a way to capture style data using Selenium WebDriver (Python)?

I'm currently trying to pull the specific typography a company uses from Stylify Me (e.g. for http://stylifyme.com/?stylify=uber.com I want to pull "UberMove, 'Open Sans', 'Helvetica Neue', Helvetica, sans-serif, normal, 52px, 56px, #000000"). However, I'm running into issues when it comes to actually pulling the text: the text shows in the HTML but does not appear when I try to pull it. I've tried pulling both the inner HTML and just the text; see the example code and output below.

from selenium import webdriver

page = webdriver.Chrome('/Downloads/chromedriver.exe')
page.get('http://stylifyme.com/')
website_finder = page.find_element_by_id('input-stylify')
website_finder.send_keys('www.bcg.com')
website_finder.submit()

#try 1:
print(page.find_element_by_id("result-header-1-dt").text)
#output 1: "Header 1: Font, Style, Size, Leading, Colour"

#try 2
print(page.find_element_by_xpath('/html/body/div[1]/table/tbody/tr[1]/th/strong').get_attribute("innerHTML"))
#output 2: "Header 1:"


HTML code:

<th id="result-header-1-dt" class="first" scope="row"><strong style="opacity: 1;">
UberMove, 'Open Sans', 'Helvetica Neue', Helvetica, sans-serif, normal, 52px, 56px, #000000
</strong> <span style="opacity: 1;">Font, Style, Size, Leading, Colour</span></th>

Any help would be greatly appreciated!

As pguardiario mentioned, the solution is to wait for the element to be loaded. Using time.sleep(5) works much of the time, but a WebDriverWait is often the better choice. time.sleep always sleeps for a fixed amount of time, which leads to unnecessary pauses when the page loads quickly and to failures when the page takes longer than the sleep. WebDriverWait returns as soon as the element is found, so the script keeps moving. If the element is not found within the maximum wait time, a TimeoutException is raised.

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.support.ui import WebDriverWait

driver = ...
max_wait_time = 5  # seconds
selector = ...
by = By.XPATH  # Or By.ID, By.CSS_SELECTOR, etc.

try:
    # Returns as soon as the element is present in the DOM
    WebDriverWait(driver, max_wait_time).until(ec.presence_of_element_located((by, selector)))
except TimeoutException:
    print("Failed to find an element with", selector, "in", max_wait_time, "seconds")
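
To make the difference from a fixed sleep concrete, here is a minimal standard-library sketch of the polling idea behind a wait like this. It is not Selenium's implementation. The names wait_until, WaitTimeout, FakePage and loaded_text are all hypothetical, and FakePage is only a stand-in for a page whose placeholder text ("Header 1:", as in the question's output) is replaced a moment after loading. If the real page behaves that way, the condition you wait on may need to check the text and not just the element's presence.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy (hypothetical name)."""


def wait_until(condition, timeout=5.0, poll_interval=0.1):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value as soon as it appears, and raises WaitTimeout
    once `timeout` seconds have passed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakePage:
    """Stand-in for a page that shows placeholder text before the real text."""

    def __init__(self, ready_after):
        self._ready_at = time.monotonic() + ready_after

    def header_text(self):
        if time.monotonic() < self._ready_at:
            return "Header 1:"  # placeholder seen in the question's output
        return "loaded typography text"  # illustrative value, not real data


def loaded_text(page, placeholder="Header 1:"):
    """Return the header text once it is no longer the placeholder, else None."""
    text = page.header_text()
    return text if text != placeholder else None


page = FakePage(ready_after=0.3)
start = time.monotonic()
result = wait_until(lambda: loaded_text(page), timeout=5.0)
elapsed = time.monotonic() - start
print(result, f"(waited {elapsed:.2f}s of a 5s budget)")
```

The wait ends shortly after the text changes and does not use up the full five seconds, which is the advantage over time.sleep(5). With Selenium, WebDriverWait(...).until(...) accepts any callable that takes the driver, so a text check along the lines of loaded_text could be passed to it in the same way.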


 