Trying to extract data from JSON URL into Pandas
I am trying to extract data from a JSON URL into pandas, but the file has multiple "layers" of lists and dictionaries that I cannot seem to navigate.
import json
from urllib.request import urlopen

with urlopen('https://statdata.pgatour.com/r/010/2020/player_stats.json') as response:
    source = response.read()

data = json.loads(source)

for item in data['tournament']['players']:
    pid = item['pid']
    statId = item['stats']['statId']
    name = item['stats']['name']
    tValue = item['stats']['tValue']
    print(pid, statId, name, tValue)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-84-eadd8bdb34cb> in <module>
      1 for item in data['tournament']['players']:
      2     player_id = item['pid']
----> 3     stat_id = item['stats']['statId']
      4     stat_name = item['stats']['name']
      5     stat_value = item['stats']['tValue']

TypeError: list indices must be integers or slices, not str
The output I am trying to get looks like this:
You are missing a layer.
To simplify, this is the data we are trying to access:

"stats": [{
    "statId": "106",
    "name": "Eagles",
    "tValue": "0"
}]
The value of 'stats' starts with [{, so it is a list (array) that contains dictionaries, not a single dictionary. That is why indexing it with a string key such as 'statId' raises the TypeError.
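To see why the original lookup fails, here is a minimal sketch using a made-up record with the same shape as the snippet above. The field names come from the question; the pid value is hypothetical.

```python
# Hypothetical sample record; only the nesting mirrors the real feed.
player = {
    "pid": "00001",  # made-up player id
    "stats": [{"statId": "106", "name": "Eagles", "tValue": "0"}],
}

stats = player["stats"]
print(type(stats).__name__)  # the value is a list, not a dict

try:
    stats["statId"]  # indexing a list with a string key
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Going one layer deeper (into a list element) gives the dictionary.
first_stat = stats[0]
print(first_stat["statId"], first_stat["name"], first_stat["tValue"])
```

Indexing with `stats[0]` only reaches the first entry, so looping over the list is the more general fix.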
I think this should work:
for item in data['tournament']['players']:
    pid = item['pid']
    # 'stats' is a list, so loop over it to reach each dictionary
    for stat in item['stats']:
        statId = stat['statId']
        name = stat['name']
        tValue = stat['tValue']
        print(pid, statId, name, tValue)
To read more on dictionaries: https://realpython.com/iterate-through-dictionary-python/
As the previous answer suggests, stats is a list of stat items. This will show you what happens, and also catch any other problems:
import json
from urllib.request import urlopen

with urlopen('https://statdata.pgatour.com/r/010/2020/player_stats.json') as response:
    source = response.read()

data = json.loads(source)

for item in data['tournament']['players']:
    try:
        pid = item['pid']
        stats = item['stats']
        for stat in stats:
            statId = stat['statId']
            name = stat['name']
            tValue = stat['tValue']
            print(pid, statId, name, tValue)
    except Exception as e:
        # Show the error and the offending record, then stop
        print(e)
        print(item)
        break
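Since the stated goal is to get the data into pandas, one possible next step is to collect the values into rows instead of printing them. This is a minimal sketch, assuming the JSON has the tournament / players / stats nesting shown in the question; the helper name `flatten_player_stats` and all sample values are made up for illustration.

```python
def flatten_player_stats(data):
    """Return one flat dict per (player, stat) pair."""
    rows = []
    for player in data["tournament"]["players"]:
        for stat in player["stats"]:
            rows.append({
                "pid": player["pid"],
                "statId": stat["statId"],
                "name": stat["name"],
                "tValue": stat["tValue"],
            })
    return rows


# Hypothetical sample with the same nesting as the feed; values are invented.
sample = {
    "tournament": {
        "players": [
            {"pid": "00001", "stats": [
                {"statId": "106", "name": "Eagles", "tValue": "0"},
                {"statId": "107", "name": "Birdies", "tValue": "12"},
            ]},
            {"pid": "00002", "stats": [
                {"statId": "106", "name": "Eagles", "tValue": "1"},
            ]},
        ]
    }
}

rows = flatten_player_stats(sample)
for row in rows:
    print(row)
```

Passing a list of dicts like `rows` to `pandas.DataFrame(rows)` gives one row per player/stat pair, with the dict keys as columns. `pandas.json_normalize(players, record_path='stats', meta=['pid'])` may be a shorter route to a similar table, though it has not been checked against this particular feed.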