Selecting links within a div tag using Beautiful Soup
I am trying to run the following code:
import requests
from bs4 import BeautifulSoup
headers = {
    'User-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36'
}
params = {
    'q': 'Machine learning',
    'hl': 'en'
}
html = requests.get('https://scholar.google.com/scholar', headers=headers,
                    params=params).text
soup = BeautifulSoup(html, 'lxml')
for result in soup.select('.gs_r.gs_or.gs_scl'):
    # This line raises the TypeError: select() returns a list, not a single tag
    profiles = result.select('.gs_a a')['href']
The following error is shown: "TypeError: list indices must be integers or slices, not str". What am I doing wrong?
The following is tested and works:
import requests
from bs4 import BeautifulSoup as bs
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.5112.79 Safari/537.36'
}
params = {
    'q': 'Machine learning',
    'hl': 'en'
}
html = requests.get('https://scholar.google.com/scholar', headers=headers,
                    params=params).text
soup = bs(html, 'lxml')
for result in soup.select('.gs_r.gs_or.gs_scl'):
    profiles = result.select('.gs_a a')
    for p in profiles:
        print(p.get('href'))
Result in terminal:
/citations?user=rSVIHasAAAAJ&hl=en&oi=sra
/citations?user=MnfzuPYAAAAJ&hl=en&oi=sra
/citations?user=09kJn28AAAAJ&hl=en&oi=sra
/citations?user=yxUduqMAAAAJ&hl=en&oi=sra
/citations?user=MnfzuPYAAAAJ&hl=en&oi=sra
/citations?user=9Vdfc2sAAAAJ&hl=en&oi=sra
/citations?user=lXYKgiYAAAAJ&hl=en&oi=sra
/citations?user=xzss3t0AAAAJ&hl=en&oi=sra
/citations?user=BFdcm_gAAAAJ&hl=en&oi=sra
/citations?user=okf5bmQAAAAJ&hl=en&oi=sra
/citations?user=09kJn28AAAAJ&hl=en&oi=sra
In your code, you were trying to read the href attribute from a list: soup.select returns a list of all matching elements, whereas soup.select_one returns a single element (or None when nothing matches). Either loop over the list and read href from each element, or use select_one when you only need the first match.
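To see the difference without hitting Google Scholar, here is a minimal sketch on a small inline HTML snippet. The markup and the user IDs in it are made up for illustration and only mimic the class names used above; it also uses the built-in 'html.parser' instead of 'lxml' so that nothing extra needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for one search result block (not real Scholar markup)
html = """
<div class="gs_r gs_or gs_scl">
  <div class="gs_a">
    <a href="/citations?user=AAA">A. Author</a>,
    <a href="/citations?user=BBB">B. Author</a> - Some Journal, 2020
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
result = soup.select_one(".gs_r.gs_or.gs_scl")

# select() returns a list-like ResultSet, so iterate to read each href
links = result.select(".gs_a a")
hrefs = [a.get("href") for a in links]

# select_one() returns a single Tag, or None when nothing matches
first = result.select_one(".gs_a a")
first_href = first["href"] if first is not None else None

# Indexing the list with a string reproduces the error from the question
raised_type_error = False
try:
    links["href"]
except TypeError:
    raised_type_error = True

print(hrefs)
print(first_href)
print(raised_type_error)
```

Checking for None before indexing matters on real pages, since some results have no author profile links at all.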