
Why am I not getting results while scraping the website?

When I scrape the items individually I get results, but when I wrap the lookups in try/except I get nothing.

from bs4 import BeautifulSoup as soup
import pandas as pd
import random
import urllib.request

data =[]
url = 'https://www.flipkart.com/search?q=iphone'

def getdata (url):
    user_agents = [
      'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:77.0) Gecko/20100101 Firefox/77.0',
      'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36'
    ]
    user_agent = random.choice(user_agents)
    header_ = {'User-Agent': user_agent}
    req = urllib.request.Request(url, headers=header_)
    amazon_html = urllib.request.urlopen(req).read()
    f_soup = soup(amazon_html,'html.parser')
    
    for e in f_soup.select('div[data-tkid="2f9a5ad9-8f64-4327-8a23-5069ba20a68b.MOBFWBYZBZ7Y56WD.SEARCH"]'):
        
        try:
            title = e.find('div',{'class':'_4rR01T'}).text
        except:
            title = None
            
        try:
            rating = e.find('span',{'class':'a-size-base s-underline-text'}).text
        except:
            rating = 0
            
        data.append({
            'Title':title,
            'Rating':rating
        })
        
    return data

getdata (url)

OUTPUT

[]

output = pd.DataFrame(data)
output

Website link: https://www.flipkart.com/search?q=iphone&page=1

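The problem is in how the title and rating elements, and the container around them, are selected. The result is an empty list rather than a list of None/0 entries, which means the loop body never ran: the outer select on a single hard-coded data-tkid value matched nothing in the fetched page, probably because that value is specific to one product in one search session. The rating lookup also uses the class 'a-size-base s-underline-text', which looks like it was carried over from another site's markup (the variable is even named amazon_html). Selecting by class instead works: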

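from bs4 import BeautifulSoup
import pandas as pd
import requests

data = []
url = 'https://www.flipkart.com/search?q=iphone'

headers = {'User-Agent': 'Mozilla/5.0'}
req = requests.get(url, headers=headers)
print(req)  # shows the HTTP status of the response

# The 'lxml' parser requires the lxml package to be installed
soup = BeautifulSoup(req.content, 'lxml')

# Loop over the result containers matched by class
for e in soup.select('.col.col-7-12'):
    try:
        title = e.select_one('._4rR01T').text
    except AttributeError:  # select_one returned None
        title = None

    try:
        rating = e.select_one('._3LWZlK').find(string=True)
    except AttributeError:  # no rating element in this container
        rating = 0

    data.append({
        'Title': title,
        'Rating': rating
    })

print(data)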

Output:

[{'Title': 'APPLE iPhone SE (White, 64 GB)', 'Rating': '4.5'},
 {'Title': 'APPLE iPhone SE (Red, 64 GB)', 'Rating': '4.5'},
 {'Title': 'APPLE iPhone 12 Mini (Blue, 64 GB)', 'Rating': '4.5'},
 {'Title': 'APPLE iPhone 13 mini (Green, 128 GB)', 'Rating': '4.6'},
 {'Title': 'APPLE iPhone 11 (Red, 128 GB)', 'Rating': '4.6'},
 {'Title': 'APPLE iPhone 12 Mini (Black, 64 GB)', 'Rating': '4.5'},
 {'Title': 'APPLE iPhone 12 Mini (White, 64 GB)', 'Rating': '4.5'},
 {'Title': 'APPLE iPhone 12 Mini (Black, 128 GB)', 'Rating': '4.5'},
 {'Title': 'APPLE iPhone 13 Mini (Midnight, 128 GB)', 'Rating': '4.6'},
 {'Title': 'APPLE iPhone SE (White, 128 GB)', 'Rating': '4.5'},
 {'Title': 'APPLE iPhone 13 Mini ((PRODUCT)RED, 128 GB)', 'Rating': '4.6'},
 {'Title': 'APPLE iPhone SE (Red, 128 GB)', 'Rating': '4.5'},
 {'Title': 'APPLE iPhone 13 Mini (Blue, 256 GB)', 'Rating': '4.6'},
 {'Title': 'APPLE iPhone 13 Mini (Pink, 128 GB)', 'Rating': '4.6'},
 {'Title': 'APPLE iPhone 13 Mini (Blue, 128 GB)', 'Rating': '4.6'},
 {'Title': 'APPLE iPhone 13 mini (Green, 256 GB)', 'Rating': '4.6'},
 {'Title': 'APPLE iPhone 13 (Blue, 128 GB)', 'Rating': '4.6'},
 {'Title': 'APPLE iPhone 13 Mini (Midnight, 256 GB)', 'Rating': '4.6'},
 {'Title': 'APPLE iPhone 12 Mini (Black, 256 GB)', 'Rating': '4.5'},
 {'Title': 'APPLE iPhone 13 (Pink, 128 GB)', 'Rating': '4.6'},
 {'Title': 'APPLE iPhone 13 (Midnight, 128 GB)', 'Rating': '4.6'},
 {'Title': 'APPLE iPhone XR (White, 128 GB)', 'Rating': '4.6'},
 {'Title': 'APPLE iPhone 12 (Green, 128 GB)', 'Rating': '4.6'},
 {'Title': 'APPLE iPhone 13 ((PRODUCT)RED, 256 GB)', 'Rating': '4.6'}]
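
If it helps, the same extraction can be split into small functions that work on an HTML string, so the selectors can be checked without hitting the site each time. This is only a sketch: the names text_or_default, extract_products and count_matches, and the SAMPLE_HTML markup, are made up for illustration, and the class names are the ones used above, which Flipkart may change at any time. It uses the built-in 'html.parser' so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Selectors taken from the code above; the site's generated class names can
# change, so treat them as assumptions and re-check them in the browser.
CARD_SELECTOR = ".col.col-7-12"
TITLE_SELECTOR = "._4rR01T"
RATING_SELECTOR = "._3LWZlK"


def text_or_default(parent, selector, default=None):
    """Return the stripped text of the first match under parent, else default."""
    node = parent.select_one(selector)
    return node.get_text(strip=True) if node is not None else default


def extract_products(html):
    """Parse search-result HTML into a list of {'Title', 'Rating'} dicts."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        {
            "Title": text_or_default(card, TITLE_SELECTOR),
            "Rating": text_or_default(card, RATING_SELECTOR, default=0),
        }
        for card in soup.select(CARD_SELECTOR)
    ]


def count_matches(html, selectors):
    """Count how many elements each selector matches, to spot the failing one."""
    soup = BeautifulSoup(html, "html.parser")
    return {selector: len(soup.select(selector)) for selector in selectors}


# Hypothetical markup standing in for a downloaded page.
SAMPLE_HTML = """
<div class="col col-7-12">
  <div class="_4rR01T">APPLE iPhone SE (White, 64 GB)</div>
  <div class="_3LWZlK">4.5<img src="star.svg"></div>
</div>
<div class="col col-7-12">
  <div class="_4rR01T">APPLE iPhone 11 (Red, 128 GB)</div>
</div>
"""

products = extract_products(SAMPLE_HTML)
print(products)

# A selector that matches zero elements is the one to fix first.
print(count_matches(SAMPLE_HTML, [CARD_SELECTOR, "span.a-size-base.s-underline-text"]))
```

With a real page you would pass req.text (or req.content) to extract_products and feed the result to pd.DataFrame. Checking for None explicitly, instead of catching every exception, keeps unrelated errors visible, and count_matches shows right away when the outer selector matches nothing, which is what produced the empty list in the question.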

