How to get text from a section of a website using Selenium in Python 3

Question

I was wondering how I can pull text from a website using Selenium and Python 3. I don't know what the text is, so I can't just look for the sentence and copy it. Here is an example screenshot: Example Problem. In this scenario I am looking for the small amount of text right after the "1.", but it is represented by just ::header, so I am having trouble grabbing it. Any ideas? Also, the website I am pulling from is Quia.

Thanks!

Answer 1

It's hard to answer directly because the page in your example is behind a login. Broadly speaking, you can use XPath expressions, which require information about the page's XML/HTML tree. You can inspect that tree in the browser's developer tools (F12 in Chrome or Firefox, or "Inspect" from the right-click context menu). Here is an example that gets the welcome text from the login page of the same server:

from selenium import webdriver
from selenium.webdriver.common.by import By

def s_obj(sel_drv, xph):
    # Return every element matching the XPath expression (empty list if none).
    return sel_drv.find_elements(by=By.XPATH, value=xph)

def s_text(sel_drv, xph):
    # Join the visible text of all matches, turning line breaks into "; ".
    els = s_obj(sel_drv, xph)
    return '; '.join(el.text.replace('\n', '; ') for el in els).strip(';').strip()

test_url = "https://www.quia.com/web"

sel_drv = webdriver.Chrome()
try:
    sel_drv.get(test_url)
    # Welcome heading on the login page.
    bs_xph = '//*/table/tbody/tr/td[@colspan="5"]/h1[@class="home"]'
    expected_txt = s_text(sel_drv, f"{bs_xph}[1]")
    print(expected_txt)
finally:
    # Close the browser even if the lookup above raises.
    sel_drv.quit()
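
One more thing that may be relevant: the question says the text shows up in the inspector only as `::header`. The screenshot isn't available here, so this is an assumption, but if that text is actually rendered by a CSS pseudo-element such as `::before` or `::after`, it is not part of the element's `.text` and XPath cannot select it. In that case a possible workaround is to ask the browser for the computed `content` value through JavaScript. The helper names below (`clean_css_content`, `pseudo_text`) are hypothetical, and this is only a sketch, not something verified against the Quia page:

```python
# JavaScript that returns the computed CSS `content` of a pseudo-element.
# arguments[0] is the element, arguments[1] the pseudo-element name.
_JS_PSEUDO_CONTENT = (
    "return window.getComputedStyle(arguments[0], arguments[1])"
    ".getPropertyValue('content');"
)

def clean_css_content(raw):
    """Strip the quotes browsers put around a computed `content` string."""
    if raw is None:
        return ""
    raw = raw.strip()
    # "none" / "normal" mean the pseudo-element generates no text.
    if raw in ("none", "normal"):
        return ""
    if len(raw) >= 2 and raw[0] == raw[-1] and raw[0] in "\"'":
        raw = raw[1:-1]
    return raw

def pseudo_text(sel_drv, element, pseudo="::before"):
    """Hypothetical helper: text generated by `pseudo` on `element`.

    `sel_drv` is a Selenium WebDriver, `element` a WebElement found earlier.
    Which pseudo-element applies is an assumption; check it in the inspector.
    """
    raw = sel_drv.execute_script(_JS_PSEUDO_CONTENT, element, pseudo)
    return clean_css_content(raw)
```

It would be used on an element located the same way as above, for example `pseudo_text(sel_drv, s_obj(sel_drv, some_xpath)[0], "::before")`, where `some_xpath` stands for whatever expression matches the numbered item on the real page.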

Answer 1: 0, accepted, 2022-05-29 19:28:35
