How do I use a CSS selector to retrieve a specific link with BeautifulSoup?

Question

I'm using Python to scrape the following page: alfabeta.surge.sh, and I would like to get the link at #home1 > div:nth-child(10) > table:nth-child(29) > tbody > tr:nth-child(1) > td:nth-child(3) > a

Currently I'm doing this:

import bs4, requests

res = requests.get('https://alfabeta.surge.sh/')
soup = bs4.BeautifulSoup(res.text, 'html.parser')
# Picks the 24th <a> on the page purely by position
soup.find_all('a')[23].attrs.get('href')

But if the position of the link changes, I can't download the content.

Answer 1

You will need to make some assumptions about what is most likely to remain constant, and then review them over time. For example, I might assume you want the href of the a tag that is a child of the 3rd-column td, from the table that is the first one following the div containing the string Catálogo Actualizaciones. One CSS pattern for that would be as follows:

import requests
from bs4 import BeautifulSoup as bs

r = requests.get('https://alfabeta.surge.sh/')
soup = bs(r.text, 'lxml')
print(soup.select_one('div:-soup-contains("Catálogo Actualizaciones") ~ table td:nth-child(3) > a')['href'])
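To check the selector without hitting the live site, here is a minimal offline sketch. The SAMPLE_HTML markup and the find_catalog_link helper are hypothetical, made up for illustration, and the real page structure may differ. It uses 'html.parser' so that lxml is not required, and it returns None instead of raising when nothing matches. Note that :-soup-contains needs a reasonably recent soupsieve (the selector engine bundled with bs4).

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (an assumption, not a copy of it).
SAMPLE_HTML = """
<div id="home1">
  <div>Otras novedades</div>
  <table><tr><td>x</td><td>y</td><td><a href="/other.zip">other</a></td></tr></table>
  <div>Catálogo Actualizaciones</div>
  <table>
    <tr><td>1</td><td>Full</td><td><a href="/catalogo.zip">download</a></td></tr>
    <tr><td>2</td><td>Delta</td><td><a href="/delta.zip">download</a></td></tr>
  </table>
</div>
"""

# Same pattern as in the answer: a table that is a later sibling of the div
# containing the label text, then the <a> directly inside a 3rd-column <td>.
SELECTOR = 'div:-soup-contains("Catálogo Actualizaciones") ~ table td:nth-child(3) > a'


def find_catalog_link(html):
    """Return the href of the first matching link, or None if nothing matches."""
    soup = BeautifulSoup(html, "html.parser")
    link = soup.select_one(SELECTOR)
    return link.get("href") if link is not None else None


if __name__ == "__main__":
    print(find_catalog_link(SAMPLE_HTML))
```

Because select_one returns the first match in document order, the table that appears before the labelled div is skipped even though it also has a link in its third column. If the label text on the page changes, the helper returns None, which is easier to detect than an IndexError or a silently wrong link.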
