Python Selenium - Extract text within <br>
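I am currently looping through all the labels and extracting data from each page, but I cannot extract the highlighted text below each category (i.e. Founded, Location, etc.). The text appears to sit within "" and above the br tags; can anyone suggest how to extract it?
Website - https://labelsbase.net/knee-deep-in-sound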
<div class="line-title-block">
<div class="line-title-wrap">
<span class="line-title-text">Founded</span>
</div>
</div>
2003
<br>
<div class="line-title-block">
<div class="line-title-wrap">
<span class="line-title-text">Location</span>
</div>
</div>
<a href="/?c=United+Kingdom">United Kingdom</a>
<br>
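For illustration, here is a minimal sketch of how the markup above is laid out: each value ("2003", "United Kingdom") is a sibling node that follows its line-title-block div rather than a child of it. The sketch uses only the standard library's html.parser on the snippet from the question. The names LabelFieldParser and parse_label_fields are hypothetical, and it assumes every value ends at the next br tag, as in the snippet; the real page may differ.

```python
from html.parser import HTMLParser

# Sample markup copied from the question.
SAMPLE_HTML = """
<div class="line-title-block">
<div class="line-title-wrap">
<span class="line-title-text">Founded</span>
</div>
</div>
2003
<br>
<div class="line-title-block">
<div class="line-title-wrap">
<span class="line-title-text">Location</span>
</div>
</div>
<a href="/?c=United+Kingdom">United Kingdom</a>
<br>
"""


class LabelFieldParser(HTMLParser):
    """Collect the text that follows each line-title-block, keyed by its title."""

    def __init__(self):
        super().__init__()
        self.fields = {}        # e.g. {"Founded": "2003"}
        self._in_title = False  # True inside <span class="line-title-text">
        self._block_depth = 0   # > 0 while inside a line-title-block div
        self._current = None    # title whose value is being collected

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div":
            if "line-title-block" in classes:
                self._current = None  # a new category starts here
                self._block_depth = 1
            elif self._block_depth:
                self._block_depth += 1  # nested div inside the title block
        elif tag == "span" and "line-title-text" in classes:
            self._in_title = True
        elif tag == "br" and not self._block_depth:
            # Assumption: a <br> ends the value, as in the snippet.
            self._current = None

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_title = False
        elif tag == "div" and self._block_depth:
            self._block_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_title:
            self._current = text
            self.fields[text] = ""
        elif self._current is not None and not self._block_depth:
            joined = self.fields[self._current] + " " + text
            self.fields[self._current] = joined.strip()


def parse_label_fields(html):
    """Return a {category: value} dict for markup shaped like SAMPLE_HTML."""
    parser = LabelFieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields


print(parse_label_fields(SAMPLE_HTML))
# {'Founded': '2003', 'Location': 'United Kingdom'}
```

The same function could presumably be fed driver.page_source once the page has loaded, although that has not been checked against the live site.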
I have tried using driver.find_elements_by_xpath and driver.execute_script, but could not find a solution.
Error message -
Message: invalid selector: The result of the xpath expression "/html/body/div[3]/div/div[1]/div[2]/div/div[1]/text()[2]" is: [object Text]. It should be an element.
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException, NoSuchElementException
import pandas as pd
import time
import string
PATH = '/Applications/chromedriver'
driver = webdriver.Chrome(PATH)
wait = WebDriverWait(driver, 10)
links = []
url = 'https://labelsbase.net/knee-deep-in-sound'
driver.get(url)
time.sleep(5)
# -- Title
title = driver.find_element_by_class_name('label-name').text
print(title,'\n')
# -- Image
image = driver.find_element_by_tag_name('img')
src = image.get_attribute('src')
print(src,'\n')
# -- Founded (this lookup raises the invalid selector error quoted above)
founded = driver.find_element_by_xpath("/html/body/div[3]/div/div[1]/div[2]/div/div[1]/text()[2]").text
print(founded,'\n')
driver.quit()
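The invalid selector error comes from the XPath ending in text(): that expression resolves to a text node, while Selenium's find_element calls only accept expressions that resolve to elements. One possible workaround, sketched below, is to read the text nodes in the browser through driver.execute_script. The function name extract_fields_with_selenium and the constant EXTRACT_FIELDS_JS are hypothetical. The script assumes each value sits between its line-title-block div and the next br tag, as in the HTML shown in the question, and it has not been verified against the live page.

```python
# JavaScript executed in the browser. For each title block it walks the
# following sibling nodes (text nodes included) until the next <br> or the
# next title block, and joins whatever text it finds.
EXTRACT_FIELDS_JS = """
const fields = {};
for (const block of document.querySelectorAll('div.line-title-block')) {
  const title = block.querySelector('.line-title-text');
  if (!title) continue;
  const parts = [];
  for (let node = block.nextSibling; node; node = node.nextSibling) {
    const isElement = node.nodeType === Node.ELEMENT_NODE;
    if (isElement && (node.tagName === 'BR' ||
                      node.classList.contains('line-title-block'))) break;
    if (!isElement && node.nodeType !== Node.TEXT_NODE) continue;
    const text = node.textContent.trim();
    if (text) parts.push(text);
  }
  fields[title.textContent.trim()] = parts.join(' ');
}
return fields;
"""


def extract_fields_with_selenium(driver):
    """Return {category: value} for the page currently loaded in `driver`.

    `driver` is expected to be a Selenium WebDriver; execute_script converts
    the JavaScript object returned by the script into a Python dict.
    """
    return driver.execute_script(EXTRACT_FIELDS_JS)


# Usage with the driver from the question, after driver.get(url):
#   fields = extract_fields_with_selenium(driver)
#   print(fields.get("Founded"))
```

Running the lookup in JavaScript sidesteps the element-only restriction, because the script returns plain strings rather than DOM nodes.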