Python: Why isn't my Google web scraping code using BeautifulSoup returning the search results?

I'm trying to write a Python script that shows me the links to the top 5 Google results for a given search query.

I'm using Beautiful Soup. After inspecting Google's HTML, I found that the search result links sit in the 'a href' tags inside 'div class="r"' elements.

import bs4, requests

mySearch=input()
address='http://www.google.com/search?q='+mySearch
googleRes=requests.get(address)

googleSoup=bs4.BeautifulSoup(googleRes.text)
linkBlocks=googleSoup.select('div.r a')

However, the list linkBlocks comes back empty instead of being filled with the search result links. How do I get the search result links into linkBlocks?

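Send a User-Agent header with the request. By default, requests identifies itself as python-requests, and Google most likely answers such clients with different HTML from what you inspected in the browser, so 'div.r a' matches nothing.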

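import bs4, requests

# Browser-like User-Agent so the response resembles what a browser receives
headers = {'User-Agent':
       'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36'}
mySearch = "beautifulsoup"
address = 'http://www.google.com/search?q=' + mySearch
googleRes = requests.get(address, headers=headers)
# Name the parser explicitly so the result does not depend on which parsers are installed
googleSoup = bs4.BeautifulSoup(googleRes.text, 'html.parser')
# Same selector as in the question: <a> tags inside <div class="r">
linkBlocks = googleSoup.select('div.r a')
print(linkBlocks)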
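
Two details the snippets above leave open are worth a small sketch. The query is concatenated into the URL as-is, which can break for searches containing spaces or '&', and the question asks for only the top 5 links. The helpers below are hypothetical (build_search_url and top_links are illustrative names, not part of requests or bs4) and use only the standard library. Treat them as one possible approach rather than the definitive one.

```python
from urllib.parse import urlencode


def build_search_url(query, base="https://www.google.com/search"):
    """Return the search URL with the query safely URL-encoded."""
    return base + "?" + urlencode({"q": query})


def top_links(hrefs, limit=5):
    """Keep the first `limit` distinct absolute http(s) links, in order."""
    seen = []
    for href in hrefs:
        if len(seen) >= limit:
            break
        # Skip missing hrefs, relative links, and duplicates
        if href and href.startswith(("http://", "https://")) and href not in seen:
            seen.append(href)
    return seen
```

With the code above, this could be used as address = build_search_url(mySearch) and then top_links(a.get('href') for a in linkBlocks). Whether 'div.r a' still matches anything depends on Google's current markup, which changes over time.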
