How can I manipulate a Python list and convert it to a pandas DataFrame?
I want to create a pandas DataFrame. I have the following list of data, received from a program:
rawdatalist = [
    {
        'Project_Name': 'App1',
        'Run Id': '25',
        'cpu': [{'Server1': (21.62, 65.16)}, {'Server2': (18.0, 60.43)}]
    },
    {
        'Project_Name': 'App1',
        'Run Id': '24',
        'cpu': [{'Server1': (17.91, 57.81)}, {'Server2': (21.33, 61.43)}, {'Server3': (2.96, 6.59)}]
    },
    {
        'Project_Name': 'App2',
        'Run Id': '25',
        'cpu': [{'Server1': (17.01, 41.28)}, {'Server2': (23.56, 68.13)}]
    },
    {
        'Project_Name': 'App2',
        'Run Id': '24',
        'cpu': [{'Server1': (22.23, 45.47)}, {'Server2': (18.65, 48.95)}, {'Server3': (1.62, 2.86)}, {'Server4': (1.59, 4.19)}]
    }
]
1st dataframe, holding the first value of each tuple:

cpu      run id 25  run id 24  run id 25  run id 24
Server1  21.62      17.91      17.01      22.23
Server2  18.0       21.33      23.56      18.65
Server3  None       2.96       None       1.62
Server4  None       None       None       1.59
2nd dataframe, holding the second value of each tuple:

cpu      run id 25  run id 24  run id 25  run id 24
Server1  65.16      57.81      41.28      45.47
Server2  60.43      61.43      68.13      48.95
Server3  None       6.59       None       2.86
Server4  None       None       None       4.19
I think there is certainly an easier way to solve the problem, and I look forward to further answers. Until then, here is my approach:
import pandas as pd
from collections import ChainMap

# store the names of the keys
key_name = 'Project_Name'
key_id = 'Run Id'
key_cpu = 'cpu'

# collect all project names and run ids
names = set()
run_ids = set()
for data in rawdatalist:
    names.add(data.get(key_name))
    run_ids.add(data.get(key_id))

# create the multi index
index = pd.MultiIndex.from_product([names, run_ids], names=[key_name, key_id])

# initialize both data frames
first_df = pd.DataFrame(index=index)
second_df = pd.DataFrame(index=index)

for data in rawdatalist:
    # store the value for each key
    project_name = data[key_name]
    run_id = data[key_id]
    cpus = data[key_cpu]

    # merge the list of one-key dicts into a single dict
    row = dict(ChainMap(*cpus))
    keys = list(row.keys())
    values = list(row.values())

    # take the first part of each tuple and set it in the first data frame
    first_value = [x[0] for x in values]
    first_df.loc[(project_name, run_id), keys] = first_value

    # take the second part of each tuple and set it in the second data frame
    second_value = [x[1] for x in values]
    second_df.loc[(project_name, run_id), keys] = second_value

# transpose so that servers become rows
first_df = first_df.T
second_df = second_df.T
Output:

Project_Name   App1          App2
Run Id           25     24     25     24
Server2       18.00  21.33  23.56  18.65
Server1       21.62  17.91  17.01  22.23
Server3         NaN   2.96    NaN   1.62
Server4         NaN    NaN    NaN   1.59

Project_Name   App1          App2
Run Id           25     24     25     24
Server2       60.43  61.43  68.13  48.95
Server1       65.16  57.81  41.28  45.47
Server3         NaN   6.59    NaN   2.86
Server4         NaN    NaN    NaN   4.19
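
One possibly simpler route, offered as a sketch rather than a definitive answer: the DataFrame constructor builds MultiIndex columns when it is given a dict of dicts keyed by tuples, so the nested list can be reshaped in plain Python first and handed to pandas in one go. The helper name build_frames and the variable names inside it are made up for illustration, and the sort_index() call is only there to put the servers in a predictable row order.

```python
import pandas as pd

# Same data as in the question.
rawdatalist = [
    {'Project_Name': 'App1', 'Run Id': '25',
     'cpu': [{'Server1': (21.62, 65.16)}, {'Server2': (18.0, 60.43)}]},
    {'Project_Name': 'App1', 'Run Id': '24',
     'cpu': [{'Server1': (17.91, 57.81)}, {'Server2': (21.33, 61.43)},
             {'Server3': (2.96, 6.59)}]},
    {'Project_Name': 'App2', 'Run Id': '25',
     'cpu': [{'Server1': (17.01, 41.28)}, {'Server2': (23.56, 68.13)}]},
    {'Project_Name': 'App2', 'Run Id': '24',
     'cpu': [{'Server1': (22.23, 45.47)}, {'Server2': (18.65, 48.95)},
             {'Server3': (1.62, 2.86)}, {'Server4': (1.59, 4.19)}]},
]


def build_frames(records):
    """Hypothetical helper: return one frame per tuple position."""
    first, second = {}, {}
    for rec in records:
        # One column per (project, run) pair; tuple keys become MultiIndex columns.
        col = (rec['Project_Name'], rec['Run Id'])
        first[col], second[col] = {}, {}
        for entry in rec['cpu']:  # each entry is a one-key dict
            for server, (first_val, second_val) in entry.items():
                first[col][server] = first_val
                second[col][server] = second_val

    frames = []
    for nested in (first, second):
        # Servers missing from a run are filled with NaN by the constructor.
        df = pd.DataFrame(nested).sort_index()
        df.columns.names = ['Project_Name', 'Run Id']
        frames.append(df)
    return frames[0], frames[1]


first_df, second_df = build_frames(rawdatalist)
print(first_df)
print(second_df)
```

Single cells can then be read by label, for example first_df.loc['Server1', ('App1', '25')]. Note that in this sketch a (project, run) pair that appears twice in the list would overwrite the earlier entry.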