
Python - iterate through nested JSON and save values

I have a nested JSON (API) website which I want to parse, saving items to a file (using the Scrapy framework).

I want to access each subelement of the given elements, which are in the following format:

0   {…}
1   {…}
2   {…}
3   {…}
4   {…}
5   {…}
6   {…}
7   {…}
8   {…}
9   {…}
10  {…}

If I expand element 0, I get the following values, where {...} expands further:

id  6738
date    "2018-06-14T09:38:51"
date_gmt    "2018-06-14T09:38:51"
guid    
     rendered   "https:example.com"
modified    "2019-03-19T20:43:50"
modified_gmt    "2019-03-19T20:43:50"

What it looks like in reality

How do I access each element in turn, first 0, then 1, then 2 ... up to a total of 350, and grab the value of, for example

guid   
    rendered "https//:example.com"

and save it to an item?

What I have:

       results = json.loads(response.body_as_unicode())
       item = DataItem()
       for var in results:
           item['guid'] = results["guid"]
       yield item

This fails with

TypeError: list indices must be integers, not str

I know that I can access it with

item['guid'] = results[0]["guid"]

But this only gives me index [0] of the whole list, and I want to iterate through all of the indexes. How do I pass the index number into the list?

Replace results["guid"] in your for loop with var["guid"]:

for var in results:
    item['guid'] = var["guid"]
    # do whatever you want with item['guid'] here

Being able to access guid as results[0]["guid"] means that results is a list of dictionaries, each of which contains a key named guid. In your for loop you index results (the list) instead of var (which holds one dictionary per iteration). That raises the TypeError, because list indices must be integers, not strings such as "guid".
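As a minimal sketch of that loop, assuming the response body is a JSON array of objects shaped like the sample in the question: the sample data and the extract_guids helper below are hypothetical, and a plain dict stands in for the Scrapy DataItem so the snippet runs on its own.

```python
import json

# Hypothetical sample body, shaped like the data shown in the question.
SAMPLE_BODY = """
[
  {"id": 6738, "guid": {"rendered": "https://example.com/a"}},
  {"id": 6739, "guid": {"rendered": "https://example.com/b"}}
]
"""


def extract_guids(body):
    """Yield one item per element of the top-level JSON list."""
    results = json.loads(body)  # a list of dicts
    for var in results:
        # Build a fresh item on every iteration so earlier values are not
        # overwritten; a plain dict stands in for DataItem here.
        item = {"guid": var["guid"]["rendered"]}
        yield item  # yield inside the loop: one item per element


items = list(extract_guids(SAMPLE_BODY))
for item in items:
    print(item["guid"])
```

In a Scrapy callback the same loop would presumably sit inside the parse method. Note that the yield belongs inside the loop: yielding once after the loop emits only a single item holding the last value.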

UPDATE: if you want to save each var["guid"], you can collect them in a dictionary like this:

guid_holder = {"guid": []}
for var in results:
    guid_holder["guid"].append(var["guid"])
for guid in guid_holder["guid"]:
    print(guid)

Now guid_holder["guid"] holds the guid value of every element.
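If only the nested rendered string is needed, the values could also be collected with a list comprehension. This is a sketch under assumptions: the results list below is made-up sample data, and the element without a guid key is hypothetical, included only to show one way of skipping incomplete entries.

```python
# Made-up sample data standing in for the parsed API response.
results = [
    {"id": 1, "guid": {"rendered": "https://example.com/a"}},
    {"id": 2},  # hypothetical element with no "guid" key
    {"id": 3, "guid": {"rendered": "https://example.com/c"}},
]

# Keep the nested "rendered" value of every element that has one.
guids = [
    var["guid"]["rendered"]
    for var in results
    if "rendered" in var.get("guid", {})
]

for guid in guids:
    print(guid)
```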
