
Python: Why isn't my Google web-scraping code using BeautifulSoup returning the search results?

I'm trying to write a Python script that shows me the links to the top 5 Google results for a given search query.

I'm using Beautiful Soup, and after inspecting Google's HTML in the browser, I found that the search result links sit in 'a href' tags inside 'div class="r"' elements.

import bs4, requests

mySearch=input()
address='http://www.google.com/search?q='+mySearch
googleRes=requests.get(address)

googleSoup=bs4.BeautifulSoup(googleRes.text)
linkBlocks=googleSoup.select('div.r a')

However, the list linkBlocks is empty instead of being filled with the search result links. How do I get the search result links into linkBlocks?

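Send a browser-like User-Agent header with the request. Without one, requests identifies itself as python-requests, and Google may respond with different markup from what you inspected in the browser, so the 'div.r a' selector matches nothing.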

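import bs4, requests

# Browser-like User-Agent so the response resembles what a browser receives
headers = {'User-Agent':
       'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36'}
mySearch = "beautifulsoup"
# Pass the query via params so requests URL-encodes spaces and special characters
googleRes = requests.get('http://www.google.com/search', params={'q': mySearch}, headers=headers)
googleRes.raise_for_status()  # stop early on an HTTP error response
# Name the parser explicitly to avoid BeautifulSoup's "no parser specified" warning
googleSoup = bs4.BeautifulSoup(googleRes.text, 'html.parser')
linkBlocks = googleSoup.select('div.r a')
print(linkBlocks)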
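
The question asks for the links to the top 5 results, while select() returns whole tag objects. A minimal sketch of pulling out just the href values is shown here. The helper name top_links and the sample HTML are made up for illustration, and the 'div.r a' selector is taken from the question; it may not match the markup Google actually serves, so treat it as an assumption to check against the real response.

```python
import bs4

def top_links(html, limit=5, selector='div.r a'):
    """Return up to `limit` distinct href values from anchors matching `selector`.

    `top_links` is a hypothetical helper; the default selector is the one
    from the question and is an assumption about the page markup.
    """
    soup = bs4.BeautifulSoup(html, 'html.parser')
    links = []
    for a in soup.select(selector):
        if len(links) >= limit:
            break
        href = a.get('href')  # None if the anchor has no href attribute
        if href and href not in links:
            links.append(href)
    return links

# Made-up HTML standing in for googleRes.text
sample_html = '''
<div class="r"><a href="https://example.com/one">One</a></div>
<div class="r"><a href="https://example.com/two">Two</a></div>
<div class="other"><a href="https://example.com/skip">Skip</a></div>
'''
print(top_links(sample_html))
```

With a real response, the call would be top_links(googleRes.text). An empty list then means the selector did not match anything, which is worth checking by printing part of googleRes.text.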
