How to scrape data from a webpage after refresh

I wrote a script that gets all of the products from an online shopping website.

As is common, the website loads more items whenever you scroll down, so I can't get enough products from the webpage.

How can I get as many products as I want?

Here is my current code:

from bs4 import BeautifulSoup
import requests 

url = "https://www.trendyol.com/erkek-t-shirt-x-g2-c73"
html_text = requests.get(url).text
main_soup = BeautifulSoup(html_text, 'lxml')
all_items = main_soup.find_all('div', class_="p-card-wrppr with-campaign-view")

for item in all_items:
    title = item.find('div', class_="prdct-desc-cntnr-ttl-w two-line-text").text
    print(title)
    print()

The data is loaded via JavaScript from their REST API, so you can make the same request yourself to obtain the information:

import requests

api_url = "https://public.trendyol.com/discovery-web-searchgw-service/v2/api/infinite-scroll/erkek-t-shirt-x-g2-c73"

params = {
    "pi": "1",
    "culture": "tr-TR",
    "userGenderId": "1",
    "pId": "0",
    "scoringAlgorithmId": "2",
    "categoryRelevancyEnabled": "false",
    "isLegalRequirementConfirmed": "false",
    "searchStrategyType": "DEFAULT",
    "productStampType": "A",
    "fixSlotProductAdsIncluded": "false",
    "searchAbDeciderValues": "",
}


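page = 1
while True:
    # the "pi" parameter selects which page of results to request
    params['pi'] = page
    data = requests.get(api_url, params=params).json()

    # stop once the API returns no more products
    if not data.get('result', {}).get('products'):
        break

    # print the ID and name of every product on this page
    for p in data['result']['products']:
        print('{:<15} {}'.format(p['id'], p['name']))
    print()

    page += 1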

Prints:

311550273       Erkek Nem Emici Hızlı Kuruma Atletik Teknik Performans T-shirt
114225365       Erkek Sarı Mikro Polyester Performans Antrenman Sporcu Tişört
35907408        Dry Park VII BV6708-010 Erkek Tişört
101771784       Erkek Siyah-Beyaz-Antrasit 3'lü Bisiklet Yaka Düz T-Shirt E001010
39501961        Erkek Siyah Pis Yaka Salaş T-shirt
311550264       Erkek Nem Emici Hızlı Kuruma Atletik Teknik Performans T-shirt
95401172        Erkek Koyu Lacivert Pike Kısa Kollu Basic Tişört
339923963       Fit NBA Brooklyn Nets Regular Fit Bisiklet Yaka Tişört
62247632        Ua Big Logo Ss - 1329583-600
101770796       Erkek Siyah 2'li Bisiklet Yaka %100 Pamuk Basic T-Shirt E001011
270985631       Erkek Beyaz Pis Yaka Salaş T-shirt
96531669        Logo Baskılı Kırmızı Tişört Slim Fit / Dar Kesim 065781-33099
101771002       Erkek Beyaz 2'li Bisiklet Yaka %100 Pamuk Basic T-Shirt E001012
382858755       Sw Tshirt | Bordo
336669786       Fit NBA Brooklyn Nets Oversize Fit Kapüşonlu Tişört
339562039       Fit NBA Golden State Warriors Boxy Fit Bisiklet Yaka Tişört
347257450       Oversize Dragon Vs Phoenix Unisex T-shirt
443287890       Oversize Bisiklet Yaka Les Benjamınsbakılı Tshirt
340645587       TULIO GRİ T-SHIRT S/S TEE
348227065       Fenerbahçe Sk Unisex Mavi Futbol Tişört 77313601
443316559       Oversize Bisiklet Yaka Les Benjamınsbakılı Tshirt
293975462       Chaos Karma Baskı Oversize Siyah Unisex Tshirt
249703589       Unisex Chicago Özel Baskılı Oversize Penye T-shirt Tişört
301833907       Unisex Distraction Siyah Tshirt

302756840       Unisex First Class Beyaz Tshirt
382195053       Wreck The World Oversize | Beige
321828215       Erkek 5'li Paket Dry Fit Siyah Lacivert Beyaz Haki Gri Atletik Nem Emici Günlük Tshirt
90315375        Logo Baskılı Siyah Tişört Slim Fit / Dar Kesim 065781-900

...and so on.
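
The loop above keeps going until the API runs out of pages. If you only want a fixed number of products, the same paging idea can be wrapped in a small helper that stops at a limit. This is a minimal sketch, not part of the original answer: `collect_products`, `fetch_page` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the real API so the example runs without network access.

```python
def collect_products(limit, fetch_page):
    """Gather up to `limit` products by requesting pages 1, 2, 3, ... in turn.

    `fetch_page` is a hypothetical callable: it takes a page number and
    returns that page's list of products (empty or None once exhausted).
    """
    products = []
    page = 1
    while len(products) < limit:
        batch = fetch_page(page)
        if not batch:  # no more pages available
            break
        products.extend(batch)
        page += 1
    return products[:limit]  # trim any overshoot from the last page


def fake_fetch(page):
    """Stand-in for the real API: three pages of two products each."""
    pages = {
        1: [{"id": 1}, {"id": 2}],
        2: [{"id": 3}, {"id": 4}],
        3: [{"id": 5}, {"id": 6}],
    }
    return pages.get(page, [])


first_five = collect_products(5, fake_fetch)
print([p["id"] for p in first_five])
```

To use it against the real endpoint, pass a function that performs the request from the answer for a given page, for example `lambda page: requests.get(api_url, params={**params, "pi": page}).json().get("result", {}).get("products")`, reusing the `api_url` and `params` defined above.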
