Trying to extract data from JSON URL into Pandas

I am trying to extract data from a JSON URL into pandas, but the file has multiple "layers" of lists and dictionaries that I cannot seem to navigate.

import json
from urllib.request import urlopen

with urlopen('https://statdata.pgatour.com/r/010/2020/player_stats.json') as response:
    source = response.read()

data = json.loads(source)

for item in data['tournament']['players']:
    pid = item['pid']
    statId = item['stats']['statId']
    name = item['stats']['name']
    tValue = item['stats']['tValue']
    print(pid, statId, name, tValue)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-84-eadd8bdb34cb> in <module>
      1 for item in data['tournament']['players']:
      2     player_id = item['pid']
----> 3     stat_id = item['stats']['statId']
      4     stat_name = item['stats']['name']
      5     stat_value = item['stats']['tValue']

TypeError: list indices must be integers or slices, not str

The output I am trying to get looks like this:

[image of the desired output]

You are missing a layer.

To simplify the data, we are trying to access:

"stats": [{
    "statId":"106",
    "name":"Eagles",
    "tValue":"0"
}]

The value of 'stats' starts with [{, so it is a list (a JSON array) containing dictionaries, not a single dictionary. Indexing that list with a string key such as 'statId' is what raises the TypeError.

I think this should work:
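
A minimal sketch of why the original lookup fails, using a hand-written sample shaped like the snippet above. The player id and the second stat are made up for illustration and are not taken from the real feed:

```python
# Hypothetical sample mirroring the structure of one player entry.
player = {
    "pid": "00001",  # made-up player id
    "stats": [
        {"statId": "106", "name": "Eagles", "tValue": "0"},
        {"statId": "107", "name": "Birdies", "tValue": "12"},  # made-up stat
    ],
}

# 'stats' is a list, so a string key raises the TypeError from the question.
try:
    player["stats"]["statId"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Indexing by position (or looping) reaches the dictionaries inside the list.
first_stat_id = player["stats"][0]["statId"]
all_stat_ids = [stat["statId"] for stat in player["stats"]]

print(error_message)
print(first_stat_id, all_stat_ids)
```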

for item in data['tournament']['players']:
    pid = item['pid']
    for stat in item['stats']:
        statId = stat['statId']
        name = stat['name']
        tValue = stat['tValue']
        print(pid, statId, name, tValue)

To read more on dictionaries: https://realpython.com/iterate-through-dictionary-python/
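
Since the stated goal is a pandas DataFrame rather than printed lines, one possible route is pandas.json_normalize, which can expand a nested list of records and carry fields from the parent along. This is only a sketch: the data dictionary below is a small hand-written stand-in for the downloaded JSON (its values are hypothetical), and it assumes pandas 1.0 or later, where json_normalize is available at the top level.

```python
import pandas as pd

# Hypothetical stand-in for the parsed JSON; the real feed has more fields.
data = {
    "tournament": {
        "players": [
            {"pid": "00001", "stats": [
                {"statId": "106", "name": "Eagles", "tValue": "0"},
                {"statId": "107", "name": "Birdies", "tValue": "12"},
            ]},
            {"pid": "00002", "stats": [
                {"statId": "106", "name": "Eagles", "tValue": "1"},
            ]},
        ]
    }
}

# One row per stat; 'pid' is copied from the enclosing player record.
df = pd.json_normalize(
    data["tournament"]["players"],
    record_path="stats",
    meta=["pid"],
)

# Put the player id first to match the order used in the print loop.
df = df[["pid", "statId", "name", "tValue"]]
print(df)
```

The values stay as strings, as in the JSON. If numeric columns are needed, something like pd.to_numeric(df["tValue"], errors="coerce") could be applied afterwards.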

As the previous answer suggests, stats is a list of stat items. The following shows what is happening and also catches any other problems:

import json
from urllib.request import urlopen

with urlopen('https://statdata.pgatour.com/r/010/2020/player_stats.json') as response:
    source = response.read()

data = json.loads(source)

for item in data['tournament']['players']:
    try:
        pid = item['pid']
        stats = item['stats']
        for stat in stats:
            statId = stat['statId']
            name = stat['name']
            tValue = stat['tValue']
            print(pid, statId, name, tValue)
    except Exception as e:
        print(e)
        print(item)
        break
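
If the aim is to keep going past a bad entry rather than stop at the first one, a variation on the same idea is to collect the good rows and the problem entries separately. The sketch below is an assumption about how that might look, not part of the original answer; collect_stats and the sample data are hypothetical names and values.

```python
def collect_stats(players):
    """Return (rows, problems) from a list of player dictionaries."""
    rows, problems = [], []
    for item in players:
        try:
            pid = item["pid"]
            for stat in item["stats"]:
                rows.append((pid, stat["statId"], stat["name"], stat["tValue"]))
        except (KeyError, TypeError) as exc:
            # Keep the offending entry for inspection and carry on.
            # Rows appended before the failure are not rolled back.
            problems.append((item, repr(exc)))
    return rows, problems


# Hypothetical sample: the second entry is missing its 'stats' key.
sample_players = [
    {"pid": "00001", "stats": [{"statId": "106", "name": "Eagles", "tValue": "0"}]},
    {"pid": "00002"},
]

rows, problems = collect_stats(sample_players)
print(rows)
print(problems)
```

The resulting list of tuples can be printed as before or handed to whatever builds the final table.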
