Python Selenium: error getting the "href" attribute

Question

I am trying to get the href from a link. Here is my code:

url ='http://money.finance.sina.com.cn/bond/notice/sz149412.html'
link = driver.find_element_by_xpath("//div[@class='blk01'])//ul//li[3]//a[contains(text(),'发行信息']").get_attribute('href')
print(link)

Error:

 invalid selector: Unable to locate an element with the xpath expression 
SyntaxError: Failed to execute 'evaluate' on 'Document': The string '//div[@class='blk01'])//ul/li[3]//a[contains(text(),'发行信息']' is not a valid XPath expression.

It seems the XPath is not valid, but I cannot figure out what is wrong. Any help will be appreciated!

Thanks

Answer 1

Try this instead:

link = driver.find_element_by_xpath('//div[@class="blk01"]//ul//li[3]//a[contains(text(), "发行信息")]')
print(link.get_attribute("href"))
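
A note for readers on newer Selenium versions: the find_element_by_* helpers were deprecated in Selenium 4 and removed in later 4.x releases, so the call above may fail there with an AttributeError. The following is a minimal sketch of the same lookup using find_element with By.XPATH. The names LINK_XPATH and get_href are made up for this illustration, and driver is assumed to be an already started WebDriver that has loaded the page.

```python
# Corrected XPath from the answer above: brackets balanced,
# double quotes inside so the Python string can use single quotes.
LINK_XPATH = '//div[@class="blk01"]//ul//li[3]//a[contains(text(), "发行信息")]'


def get_href(driver, xpath=LINK_XPATH):
    """Return the href of the first element matching `xpath`.

    `driver` is assumed to be a running Selenium WebDriver.
    Selenium raises NoSuchElementException if nothing matches.
    """
    # Imported here so the file can be loaded even where Selenium is not installed.
    from selenium.webdriver.common.by import By

    element = driver.find_element(By.XPATH, xpath)
    return element.get_attribute("href")
```

Usage would look like print(get_href(driver)) after driver.get(url).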

Answer 2

//a[contains(text(),'发行信息')]

Even this would work.

print(link.get_attribute("href"))

Answer 3

# Importing necessary modules
from seleniumwire import webdriver
from webdriver_manager.chrome import ChromeDriverManager
import time

# WebDriver Chrome
driver = webdriver.Chrome(ChromeDriverManager().install())

# Target URL
url = 'http://money.finance.sina.com.cn/bond/notice/sz149412.html'
driver.get(url)
time.sleep(5)
link = driver.find_element_by_xpath('//*[@class="blue" and contains(text(),"发行信息")]').get_attribute('href')
print(link)
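
The fixed time.sleep(5) above always waits five seconds, even if the link is ready sooner, and still fails if the page takes longer. As a hedged alternative, here is a sketch using Selenium's explicit wait (WebDriverWait with expected_conditions). The function name wait_for_href and the 10 second default timeout are assumptions made for this example, not part of the original answer.

```python
LINK_XPATH = '//a[contains(text(), "发行信息")]'


def wait_for_href(driver, xpath=LINK_XPATH, timeout=10):
    """Wait up to `timeout` seconds for the element to appear, then return its href.

    `driver` is assumed to be a running Selenium WebDriver that has already
    loaded the page. Selenium raises TimeoutException if the element never appears.
    """
    # Imported here so the file can be loaded even where Selenium is not installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    element = WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.XPATH, xpath))
    )
    return element.get_attribute("href")
```

With this, the time.sleep(5) line can be dropped and the lookup becomes print(wait_for_href(driver)).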

Answer 4

//div[@class='blk01'])//ul//li[3]//a[contains(text(),'发行信息']

does not seem to be a stable XPath, and its brackets are unbalanced: there is a stray ) after [@class='blk01'], and the ) that closes contains( is missing before the final ]. This is the main problem.

Try this first:

find_element_by_xpath('//div[@class="blk01"]//ul//li[3]//a[contains(text(),"发行信息")]')

If it works, try just:

find_element_by_xpath('//a[contains(text(),"发行信息")]')

The goal is to make the XPath as short as possible.
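
To see where the expression from the question breaks, a small bracket check can help before handing the string to the browser. This is only a sketch: find_bracket_errors is a hypothetical helper written for this example, and it checks nothing but the pairing of ( ) and [ ] outside quoted strings, so it is not a full XPath validator.

```python
def find_bracket_errors(xpath):
    """Return a list of (position, message) for unbalanced ( ) [ ] in `xpath`."""
    pairs = {")": "(", "]": "["}
    stack = []    # (bracket, position) for every bracket that is still open
    errors = []
    quote = None  # the quote character we are currently inside, if any

    for pos, ch in enumerate(xpath):
        if quote:
            # XPath 1.0 string literals have no escapes: the same quote closes them.
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch in "([":
            stack.append((ch, pos))
        elif ch in ")]":
            if stack and stack[-1][0] == pairs[ch]:
                stack.pop()
            else:
                errors.append((pos, f"unexpected {ch!r}"))

    # Anything left on the stack was opened but never closed.
    errors.extend((pos, f"unclosed {ch!r}") for ch, pos in stack)
    return errors


broken = "//div[@class='blk01'])//ul//li[3]//a[contains(text(),'发行信息']"
fixed = '//div[@class="blk01"]//ul//li[3]//a[contains(text(),"发行信息")]'

for position, message in find_bracket_errors(broken):
    print(position, message)
print(find_bracket_errors(fixed))
```

For the broken expression, the first problem reported is the stray ) at index 21, right after [@class='blk01']. The corrected expression yields an empty list.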

Answer 5

Is there any particular reason to use Selenium here? The link is present in the HTML source, so it would be more efficient to use requests and BeautifulSoup.

import requests
from bs4 import BeautifulSoup

url = 'http://money.finance.sina.com.cn/bond/notice/sz149412.html'
response = requests.get(url)

soup = BeautifulSoup(response.text, 'html.parser')


a_tag = soup.select_one('a:contains("发行信息")')
# a_tag = soup.select_one('a:-soup-contains("发行信息")')  # <- depending on your bs4 version, the line above may raise an error or a warning because :contains is deprecated; use this line instead

link = a_tag['href']

Output:

print(link)
http://money.finance.sina.com.cn/bond/issue/sz149412.html
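
If installing bs4 is not an option, the same idea (find the a tag whose text contains 发行信息 and read its href) can be sketched with the standard library's html.parser. This is an illustration, not the approach from the answer: LinkFinder is a hypothetical class, and SAMPLE_HTML is a made-up fragment that only imitates the page's structure. In real use the string fed to the parser would be the downloaded page source.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collect the href of every <a> whose text contains `needle`."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.matches = []
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if self.needle in "".join(self._text):
                self.matches.append(self._href)
            self._href = None


# Made-up fragment for illustration only; the hrefs here are assumptions.
SAMPLE_HTML = """
<div class="blk01"><ul>
  <li><a class="blue" href="/bond/quotes/sz149412.html">行情</a></li>
  <li><a class="blue" href="/bond/issue/sz149412.html">发行信息</a></li>
</ul></div>
"""

finder = LinkFinder("发行信息")
finder.feed(SAMPLE_HTML)
print(finder.matches)
```

The parser keeps every matching href in finder.matches, so finder.matches[0] plays the role of a_tag['href'] above.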

5 answers

Answer 1
1 2021-03-29 09:56:12

Answer 2
1 2021-03-29 10:00:39

Answer 3
0 2021-03-29 10:21:21

Answer 4
0 2021-03-29 16:04:50

Answer 5
0 2021-04-02 08:58:42
