How to extract specific data from a JSON object using Python?

I'm trying to scrape a website and get a list of items from it using Python. I parsed the HTML using BeautifulSoup and converted the embedded JSON into a Python object using json.loads(data). The JSON object looks like this:

{ ".1768j8gv7e8__0":{ 
    "context":{ 
       //some info
    },
    "pathname":"abc",
    "showPhoneLoginDialog":false,
    "showLoginDialog":false,
    "showForgotPasswordDialog":false,
    "isMobileMenuExpanded":false,
    "showFbLoginEmailDialog":false,
    "showRequestProductDialog":false,
    "isContinueWithSite":true,
    "hideCoreHeader":false,
    "hideVerticalMenu":false,
    "sequenceSeed":"web-157215950176521",
    "theme":"default",
    "offerCount":null
 },
 ".1768j8gv7e8.6.2.0.0__6":{ 
    "categories":[ 

    ],
    "products":{ 
       "count":12,
       "items":[ 
          { 
             //item info
          },
          { 
            //item info
          },
          { 
            //item info
          }
       ],
       "pageSize":50,
       "nextSkip":100,
       "hasMore":false
    },
    "featuredProductsForCategory":{ 

    },
    "currentCategory":null,
    "currentManufacturer":null,
    "type":"Search",
    "showProductDetail":false,
    "updating":false,
    "notFound":false
 }
}

I need the items list from the products section. How can I extract it?

Just do:

products = jsonObject[list(jsonObject.keys())[1]]["products"]["items"]
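The one-liner relies on dictionaries preserving insertion order (guaranteed since Python 3.7), so that the second top-level key is the block holding "products". A minimal sketch on a trimmed-down copy of the question's data, where the "id" fields are invented placeholders for the elided item info:

```python
import json

# Trimmed stand-in for the question's data; the "id" fields are invented
# placeholders for the elided item info.
raw = '''
{
  ".1768j8gv7e8__0": {"pathname": "abc", "theme": "default"},
  ".1768j8gv7e8.6.2.0.0__6": {
    "categories": [],
    "products": {"count": 12, "items": [{"id": 1}, {"id": 2}, {"id": 3}], "hasMore": false}
  }
}
'''

jsonObject = json.loads(raw)

# Take the second top-level key, then walk down to the items list.
second_key = list(jsonObject.keys())[1]
products = jsonObject[second_key]["products"]["items"]
print(products)
```

If the page ever changes the order of the top-level blocks, the hardcoded index 1 will point at the wrong entry, so this is best treated as a quick fix.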

Import the json package and map every top-level entry to its list of items, if it has one.

This solution is more general: it checks every top-level entry in your JSON and collects the items without hardcoding the index of an element.

import json

data = '{"p1": { "pathname":"abc" },  "p2": { "pathname":"abcd", "products": { "items" : [1,2,3]} }}'

# use the json package to convert the JSON string to a dictionary
jsonData = json.loads(data)
type(jsonData)  # dict

# use a list comprehension to iterate over all top-level entries
# itemData["products"]["items"] - select the items from an entry
# if "products" in itemData - keep only entries that have products
[itemData["products"]["items"] for itemId, itemData in jsonData.items() if "products" in itemData]

Edit: added comments to the code.
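The comprehension above returns a list of lists (one inner list per matching entry) and would fail if a top-level value is not an object. As a hedged variation, the sketch below flattens the result and skips such values; the helper name collect_items and the extra "p3" entry are assumptions added for illustration:

```python
import json

def collect_items(json_data):
    """Return one flat list of every entry found under products -> items."""
    found = []
    for block in json_data.values():
        # Skip top-level values that are not objects (e.g. null or a list).
        if not isinstance(block, dict):
            continue
        products = block.get("products")
        if isinstance(products, dict):
            found.extend(products.get("items", []))
    return found

# Same sample as above, plus a hypothetical null entry ("p3").
data = '{"p1": {"pathname": "abc"}, "p2": {"pathname": "abcd", "products": {"items": [1, 2, 3]}}, "p3": null}'
items = collect_items(json.loads(data))
print(items)  # [1, 2, 3]
```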

I'll call the JSON response you got via BeautifulSoup "response" and use a sample key from the items array, such as itemId:

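import json

# response is assumed to be a file-like object holding the JSON document
json_obj = json.load(response)
item_ids = []
# assumes "items" is a top-level key; for the structure in the question,
# navigate to ["products"]["items"] inside the relevant block first
for item in json_obj['items']:
    item_ids.append(item['itemId'])
print(item_ids)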
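Note that json.load expects a file-like object, while json.loads expects a string. The following self-contained sketch shows the difference, using io.StringIO as a stand-in for the response; the payload and its "itemId" values are invented for illustration:

```python
import io
import json

# Hypothetical payload; "itemId" is the sample key used in the answer above.
payload = '{"items": [{"itemId": "a1"}, {"itemId": "b2"}]}'

# json.loads parses a string that is already in memory.
from_string = json.loads(payload)

# json.load reads from a file-like object; io.StringIO stands in for the response.
response = io.StringIO(payload)
from_file = json.load(response)

# Collect the itemId of every entry in the items array.
item_ids = [item["itemId"] for item in from_file["items"]]
print(item_ids)
```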
