Nested JSON and Pandas v2
I asked a question yesterday about how to turn a JSON file into a dataframe, but I was asking the wrong question:
Nested JSON and Pandas
I have a JSON file that looks like the one below.
There are two levels of keys (sometimes repeating and other times not):
{
    "Abaddon the Despoiler": {
        "Abaddon the Despoiler": {
            "model_count": "1",
            "points_value": "220\u2022",
            "movement": "6\"",
            "weapon_skill": "2+",
            "ballistic_skill": "2+",
            "strength": "5",
            "toughness": "5",
            "wounds": "8",
            "attacks": "6",
            "leadership": "10",
            "save": "2+"
        }
    },
    "Chaos Space Marines": {
        "Chaos Space Marine": {
            "model_count": "4-19",
            "points_value": "14",
            "movement": "6\"",
            "weapon_skill": "3+",
            "ballistic_skill": "3+",
            "strength": "4",
            "toughness": "4",
            "wounds": "1",
            "attacks": "1",
            "leadership": "7",
            "save": "3+"
        },
        "Aspiring Champion": {
            "model_count": "1",
            "points_value": "14",
            "movement": "6\"",
            "weapon_skill": "3+",
            "ballistic_skill": "3+",
            "strength": "4",
            "toughness": "4",
            "wounds": "1",
            "attacks": "2",
            "leadership": "8",
            "save": "3+"
        }
    }
}
I would like to convert this to a data frame that looks like the one below:
unit | model | model_count | points_value | movement | weapon_skill | ballistic_skill | strength | toughness | wounds | attacks | leadership | save |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Abaddon the Despoiler | Abaddon the Despoiler | 1 | 220• | 6" | 2+ | 2+ | 5 | 5 | 8 | 6 | 10 | 2+ |
Chaos Space Marines | Chaos Space Marine | 4-19 | 14 | 6" | 3+ | 3+ | 4 | 4 | 1 | 1 | 7 | 3+ |
Chaos Space Marines | Aspiring Champion | 1 | 14 | 6" | 3+ | 3+ | 4 | 4 | 1 | 2 | 8 | 3+ |
@azro provided this useful answer to my question yesterday, but I asked the wrong question. In the original question, I wanted to skip the second level of keys, so the result looked like the one below:
unit | model_count | points_value | movement | weapon_skill | ballistic_skill | strength | toughness | wounds | attacks | leadership | save |
---|---|---|---|---|---|---|---|---|---|---|---|
Abaddon the Despoiler | 1 | 220• | 6" | 2+ | 2+ | 5 | 5 | 8 | 6 | 10 | 2+ |
Chaos Lord | 1 | 80 | 6" | 2+ | 2+ | 4 | 4 | 5 | 4 | 9 | 3+ |
import pandas as pd

d = {'Abaddon the Despoiler': {'Abaddon the Despoiler': {'model_count': '1', 'points_value': '220•', 'movement': '6"', 'weapon_skill': '2+', 'ballistic_skill': '2+', 'strength': '5', 'toughness': '5', 'wounds': '8', 'attacks': '6', 'leadership': '10', 'save': '2+'}},
'Chaos Lord': {'Chaos Lord': {'model_count': '1', 'points_value': '80','movement': '6"', 'weapon_skill': '2+', 'ballistic_skill': '2+', 'strength': '4', 'toughness': '4', 'wounds': '5', 'attacks': '4', 'leadership': '9', 'save': '3+'}}}
# This only works when the inner key repeats the outer (unit) key
data = [{'unit': key, **values[key]} for key, values in d.items()]
nycphil = pd.DataFrame(data)
Use a nested list comprehension that merges a small dict holding both key levels into each inner dictionary, then pass the resulting list to the DataFrame constructor:
# d is the nested dictionary from the question (unit -> model -> stats)
L = [{**{'unit': k, 'model': k1}, **v1} for k, v in d.items() for k1, v1 in v.items()]
df = pd.DataFrame(L)
print(df)
unit model model_count points_value \
0 Abaddon the Despoiler Abaddon the Despoiler 1 220•
1 Chaos Space Marines Chaos Space Marine 4-19 14
2 Chaos Space Marines Aspiring Champion 1 14
movement weapon_skill ballistic_skill strength toughness wounds attacks \
0 6" 2+ 2+ 5 5 8 6
1 6" 3+ 3+ 4 4 1 1
2 6" 3+ 3+ 4 4 1 2
leadership save
0 10 2+
1 7 3+
2 8 3+
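As an alternative sketch (not part of the answer above), the two key levels can first be turned into a MultiIndex and then moved back into ordinary columns. The names `flat` and `df_alt` are illustrative, and the sample data is a trimmed copy of the JSON from the question. The row order may not match the list-comprehension version in every pandas version, so it should not be relied on.

```python
import pandas as pd

# Trimmed copy of the question's data (only a few stats kept for brevity)
d = {
    "Abaddon the Despoiler": {
        "Abaddon the Despoiler": {"model_count": "1", "wounds": "8", "save": "2+"},
    },
    "Chaos Space Marines": {
        "Chaos Space Marine": {"model_count": "4-19", "wounds": "1", "save": "3+"},
        "Aspiring Champion": {"model_count": "1", "wounds": "1", "save": "3+"},
    },
}

# Key each inner stats dict by a (unit, model) tuple
flat = {
    (unit, model): stats
    for unit, models in d.items()
    for model, stats in models.items()
}

# Tuple keys become a two-level MultiIndex; name the levels and
# move them back into regular columns
df_alt = (
    pd.DataFrame.from_dict(flat, orient="index")
    .rename_axis(["unit", "model"])
    .reset_index()
)
print(df_alt)
```

The list comprehension is probably the simpler choice here; this form is mainly useful if you want to keep `unit` and `model` as an index for later lookups, in which case the final `reset_index()` can be dropped.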
EDIT: After some tests, it turned out that the full file contains some nested values with error, which are not dictionaries. You can omit them, and then the output is:
import json
import pandas as pd

with open('chaos-space-marines.json') as f:
    d = json.load(f)

L = []
for k, v in d.items():
    # skip top-level values that are not dictionaries
    if isinstance(v, dict):
        for k1, v1 in v.items():
            # skip nested values that are not dictionaries
            if isinstance(v1, dict):
                L.append({**{'unit': k, 'model': k1}, **v1})

df = pd.DataFrame(L)
print(df)
unit model \
0 Abaddon the Despoiler Abaddon the Despoiler
1 Chaos Lord Chaos Lord
2 Chaos Lord in Terminator Armour Chaos Lord in Terminator Armour
3 Cypher Cypher
4 Daemon Prince Daemon Prince
.. ... ...
122 Hellforged Spartan Assault Tank Hellforged Spartan Assault Tank
123 Hellforged Typhon Heavy Siege Tank Hellforged Typhon Heavy Siege Tank
124 Kytan Ravager Kytan Ravager
125 Chaos Bastion Chaos Bastion
126 Noctilith Crown Noctilith Crown
model_count points_value movement weapon_skill ballistic_skill strength \
0 1 220• 6" 2+ 2+ 5
1 1 80 6" 2+ 2+ 4
2 1 95 5" 2+ 2+ 4
3 1 85• 7" 2+ 2+ 4
4 1 150 8" 2+ 2+ 7
.. ... ... ... ... ... ...
122 1 320 * * * 8
123 1 720 * * * 8
124 1 430 * 3+ 3+ *
125 1 150 - - 5+ -
126 1 85 - - 4+ -
toughness wounds attacks leadership save
0 5 8 6 10 2+
1 4 5 4 9 3+
2 4 6 4 9 2+
3 4 5 4 9 3+
4 6 8 4 10 3+
.. ... ... ... ... ...
122 8 20 4 9 2+
123 9 22 7 9 2+
124 8 22 * 9 3+
125 10 20 0 6 4+
126 8 14 - - 3+
[127 rows x 13 columns]
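To try the filtering logic without the full file, here is a minimal self-contained sketch. The function name `nested_to_rows` and the sample dictionary, including its non-dictionary "error" entries, are made up for illustration; the problematic values in the real file may look different.

```python
import pandas as pd


def nested_to_rows(d):
    """Flatten {unit: {model: stats}} into a list of row dicts,
    skipping any value that is not a dictionary."""
    rows = []
    for unit, models in d.items():
        if not isinstance(models, dict):
            continue  # top-level value is not a dict of models
        for model, stats in models.items():
            if not isinstance(stats, dict):
                continue  # nested value is not a dict of stats
            rows.append({"unit": unit, "model": model, **stats})
    return rows


# Hypothetical sample with two malformed entries
sample = {
    "Chaos Lord": {"Chaos Lord": {"wounds": "5", "save": "3+"}},
    "Broken Unit": "error",
    "Cypher": {"Cypher": {"wounds": "5", "save": "3+"}, "Extra": "error"},
}

df_sample = pd.DataFrame(nested_to_rows(sample))
print(df_sample)
```

Only the two well-formed entries end up as rows; the string values at either level are silently dropped.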
EDIT:
You can also inspect the problematic values (the reason the original solution failed) by adding else branches:
import json
import pandas as pd

with open('chaos-space-marines.json') as f:
    d = json.load(f)

L = []
for k, v in d.items():
    if isinstance(v, dict):
        for k1, v1 in v.items():
            if isinstance(v1, dict):
                L.append({**{'unit': k, 'model': k1}, **v1})
            else:
                # nested value that is not a dictionary
                print('inner loop')
                print(v1)
    else:
        # top-level value that is not a dictionary
        print('outer loop')
        print(v)

df = pd.DataFrame(L)
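If printing gets noisy, one possible variation is to collect the skipped entries together with their keys so they can be inspected afterwards. This is only a sketch: `split_rows`, `df_ok`, `df_skipped` and the sample data are hypothetical names and values, not taken from the original file.

```python
import pandas as pd


def split_rows(d):
    """Return (rows, skipped): the valid rows plus the entries that were ignored."""
    rows, skipped = [], []
    for unit, models in d.items():
        if not isinstance(models, dict):
            skipped.append((unit, None, models))  # failed at the outer level
            continue
        for model, stats in models.items():
            if isinstance(stats, dict):
                rows.append({"unit": unit, "model": model, **stats})
            else:
                skipped.append((unit, model, stats))  # failed at the inner level
    return rows, skipped


# Hypothetical sample with one bad value at each level
sample = {
    "Chaos Lord": {"Chaos Lord": {"wounds": "5", "save": "3+"}},
    "Broken Unit": "error",
    "Cypher": {"Cypher": {"wounds": "5", "save": "3+"}, "Extra": "error"},
}

rows, skipped = split_rows(sample)
df_ok = pd.DataFrame(rows)
df_skipped = pd.DataFrame(skipped, columns=["unit", "model", "value"])
print(df_skipped)
```

Keeping the unit and model keys next to each skipped value makes it easier to find the offending entries in the JSON file.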