Web scraping with Python: status_code is 200 but response.json() contains no data

I am trying to scrape Shopee product information using Python.

Take https://SHOPEE.Z4D236D9A2D9A2D102C5FE6AD1C50DA4BEC50DA4BEC50Z.MY/ALL%20IN%20IN%20IN%20ONE%20ONE%20PC%20PCPITILL以%2023.8%20Inch%20computer%20Office%20Desktop%20All-in-one%20desktop%20Support%20WiFi-i.206039726.5859069631 as an example.

Since the page loads its data via AJAX, I am trying to fetch the data directly from: https://shopee.com.my/api/v2/item/get?itemid=5859069631&shopid=206039726

When I open the link above in a browser, it returns all the information I need. But when I try to fetch it with requests.get(), the response is a JSON object with no actual data:

{'item': None, 'version': 'be8962b139db1273b88c291407137744', 'data': None, 'error_msg': None, 'error': None}

My code:

import requests

url = 'https://shopee.com.my/api/v2/item/get?itemid=5859069631&shopid=206039726'

# No custom headers are sent here, only the requests defaults.
response = requests.get(url)

if response.status_code == 200:
    item_info = response.json()
    print(item_info)

Strangely, the same code works perfectly when I try to extract the shop information with: url = 'https://shopee.com.my/api/v4/product/get_shop_info?shopid=206039726'.

I don't know why this happens or how I should fix it. Thanks a lot!
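The symptom described above (HTTP 200, yet 'item' is None) can be checked for explicitly instead of relying on the status code alone. A minimal sketch, assuming the payload shape shown in the question; has_item_data is a hypothetical helper name, not part of requests or of any Shopee API:

```python
def has_item_data(payload):
    """Return True only if the decoded JSON actually carries an item."""
    # A 200 response can still decode to {"item": None, ...}, as in the question.
    return isinstance(payload, dict) and payload.get("item") is not None


# Payload copied from the question: the request succeeded, but nothing useful came back.
empty_payload = {
    "item": None,
    "version": "be8962b139db1273b88c291407137744",
    "data": None,
    "error_msg": None,
    "error": None,
}

print(has_item_data(empty_payload))                      # False
print(has_item_data({"item": {"itemid": 5859069631}}))   # True
```

Calling this on response.json() before using the result separates "the request failed" from "the server answered but withheld the data".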

Add a User-Agent HTTP header to get the correct result:

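import json
import requests

# With the default requests User-Agent this endpoint returned an empty payload,
# so send a browser-like one instead.
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
}
url = "https://shopee.com.my/api/v2/item/get?itemid=5859069631&shopid=206039726"

response = requests.get(url, headers=headers)

if response.status_code == 200:
    item_info = response.json()
    # Print inside the check so item_info is never used when it is undefined.
    print(json.dumps(item_info, indent=4))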

Prints:

{
    "item": {
        "itemid": 5859069631,
        "price_max_before_discount": -1,
        "item_status": "n",
        "can_use_wholesale": false,
        "brand_id": null,
        "show_free_shipping": true,
        "estimated_days": 6,
        "is_hot_sales": false,
        "is_slash_price_item": false,
        "upcoming_flash_sale": null,
        "slash_lowest_price": null,
        "is_partial_fulfilled": false,
        "condition": 2,
        "show_original_guarantee": true,
        "add_on_deal_info": null,
        "is_non_cc_installment_payment_eligible": false,
        "categories": [
            {
                "display_name": "Computer & Access",
                "catid": 340,
                "image": null,
                "no_sub": true,
                "is_default_subcat": true,
                "block_buyer_platform": null
            },
            {
                "display_name": "Des",
                "catid": 17578,
                "image": null,
                "no_sub": false,
                "is_default_subcat": true,
                "block_buyer_platform": null
            },
            {
                "display_name": "All-in-one Des",
                "catid": 20050,
...
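Once the payload contains data, individual fields can be read from the nested dictionaries. A minimal sketch that joins the category names into a path, assuming the structure shown in the (truncated) output above; category_path is a hypothetical helper, and the sample dict contains only the fields visible in that output:

```python
def category_path(item_info):
    """Join the display names of an item's categories, outermost first."""
    # Tolerate the empty payload ({"item": None}) seen without a User-Agent.
    item = item_info.get("item") or {}
    categories = item.get("categories") or []
    return " > ".join(cat["display_name"] for cat in categories)


# Sample built from the fields visible in the output above.
sample = {
    "item": {
        "itemid": 5859069631,
        "categories": [
            {"display_name": "Computer & Access", "catid": 340},
            {"display_name": "Des", "catid": 17578},
            {"display_name": "All-in-one Des", "catid": 20050},
        ],
    }
}

print(category_path(sample))  # Computer & Access > Des > All-in-one Des
```

The same function returns an empty string for the empty payload from the question, so it can be called without first checking whether 'item' is present.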
