
BeautifulSoup can't crawl google search results?

I am trying to crawl Google search results. This code works well with every other site I have tried, but not with Google: it returns an empty list.

from BeautifulSoup import BeautifulSoup
import requests

def googlecrawler(search_term):
    url="https://www.google.co.in/?gfe_rd=cr&ei=UVSeVZazLozC8gfU3oD4DQ&gws_rd=ssl#q="+search_term
    junk_code=requests.get(url)
    ok_code=junk_code.text
    good_code=BeautifulSoup(ok_code)
    best_code=good_code.findAll('h3',{'class':'r'})
    print best_code


googlecrawler("healthkart") 

It should return something like this.

<h3 class="r"><a href="/url?  sa=t&amp;rct=j&amp;q=&amp;esrc=s&amp;source=web&amp;cd=6&amp;cad=rja&amp;uact=8&amp;ved=0CEIQFjAF&amp;url=http%3A%2F%2Fwww.coupondunia.in%2Fhealthkart&amp;ei=qFmfVc2fFNO0uASti4PwDQ&amp;usg=AFQjCNFHMzqn-rH4Hp-fZK0E4wwxJmevEg&amp;sig2=QgwxMBdbPndyQTSH10dV2Q" onmousedown="return rwt(this,'','','','6','AFQjCNFHMzqn-rH4Hp-fZK0E4wwxJmevEg','QgwxMBdbPndyQTSH10dV2Q','0CEIQFjAF','','',event)" data-href="http://www.coupondunia.in/healthkart">HealthKart Coupons: July 2015 Coupon Codes</a></h3>

Looking at good_code, I can't see an h3 tag or the class "r" anywhere. That is why your code returns an empty list.
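One way to confirm this without reading the markup by eye is to count the matching tags in whatever HTML came back. The following is a minimal sketch, assuming Python 3 and using only the standard library's html.parser rather than BeautifulSoup. The class H3Counter, the helper count_h3, and both sample strings are hypothetical and are not taken from a real Google response.

```python
from html.parser import HTMLParser


class H3Counter(HTMLParser):
    """Counts <h3> start tags, and how many of them carry the class "r"."""

    def __init__(self):
        super().__init__()
        self.h3_total = 0
        self.h3_class_r = 0

    def handle_starttag(self, tag, attrs):
        if tag != "h3":
            return
        self.h3_total += 1
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if "r" in classes:
            self.h3_class_r += 1


def count_h3(html):
    """Return (number of <h3> tags, number of <h3 class="r"> tags) in html."""
    counter = H3Counter()
    counter.feed(html)
    return counter.h3_total, counter.h3_class_r


# Hypothetical samples: one with the expected markup, one without it.
sample_with = '<div><h3 class="r"><a href="/url?q=x">Result</a></h3></div>'
sample_without = '<html><body><form action="/search"></form></body></html>'

print(count_h3(sample_with))     # (1, 1)
print(count_h3(sample_without))  # (0, 0)
```

Passing junk_code.text from the question to count_h3 would show whether the downloaded page contains any such headings at all.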

There is no problem with your code as such; rather, the elements you are searching for are not in the page that was returned.

What were you expecting it to return?
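One likely reason the headings are missing is the "#q=" in the URL. Everything after "#" is a URL fragment, which an HTTP client keeps to itself and never sends to the server, so the request above asks for the Google homepage rather than a results page. The sketch below, assuming Python 3, shows what is actually requested. The helpers request_target and search_url are hypothetical, and the "/search" path with a "q" query parameter is an assumption that does not appear in the question. Google may also serve different markup to scripts, or block them, so this is not guaranteed to yield h3 tags with class "r".

```python
from urllib.parse import urlencode, urlsplit


def request_target(url):
    """Return the path and query an HTTP client sends; the fragment is dropped."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


def search_url(term, base="https://www.google.co.in/search"):
    """Build a URL that carries the search term in the query string (assumed endpoint)."""
    return base + "?" + urlencode({"q": term})


original = ("https://www.google.co.in/?gfe_rd=cr&ei=UVSeVZazLozC8gfU3oD4DQ"
            "&gws_rd=ssl#q=healthkart")

print(urlsplit(original).fragment)  # q=healthkart (stays on the client)
print(request_target(original))     # /?gfe_rd=cr&ei=UVSeVZazLozC8gfU3oD4DQ&gws_rd=ssl
print(search_url("healthkart"))     # https://www.google.co.in/search?q=healthkart
```

With the term in the query string, the server at least receives it. Whether the response then contains the expected headings still has to be checked against the HTML that comes back.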
