
Creating a pandas dataframe with indexes out of dictionary keys

I have a dictionary like this:

my_dict

{'metric_1-metric2': [0.062034245713139154,
  0.7711299537484807,
  0.9999999999999999,
  ['US', 'mobile', 'google'],
  ['UK', 'desktop', 'facebook']],
 'metric_1-metric_3': [-0.9607689228305227,
  -0.12803370313903312,
  0.778375882191523,
  ['CAN', 'tablet', 'google'],
  ['UK', 'desktop', 'yahoo']],
 'metric_1-metric_4': [-0.4678967355247944,
  0.6600255030070277,
  0.9999999999999999,
  ['PT', 'desktop', 'gmail'],
  ['UK', 'desktop', 'apple']]}

I am trying to achieve the following result:

df

A               B                   C       D       E       F                               G
metric_1        metric_2            0.062   0.771   0.999   ['US', 'mobile', 'google']      ['UK', 'desktop', 'facebook']
metric_1        metric_3            -0.960  -0.128  0.778   ['CAN', 'tablet', 'google']     ['UK', 'desktop', 'yahoo']
metric_1        metric_4            -0.467  0.660   0.999   ['PT', 'desktop', 'gmail']      ['UK', 'desktop', 'apple']

Clearly, I need to split each key of my_dict on the '-' character:

index_names = []
column_names = []

for x in my_dict.keys():
    index_names.append(x.split('-')[0])
    column_names.append(x.split('-')[1])

How can I create such a structure in a pandas dataframe?

It works as suggested in the comments, with a little extra:

import pandas as pd

df = pd.DataFrame(my_dict).T       # one row per dictionary key

(df.index.to_series()             # get the metrics from index
   .str.split('-', expand=True)   # split by `-`
   .rename(columns={0:'A',1:'B'}) # rename the metric
   .join(df)                      # join as usual
   .reset_index(drop=True)        # remove the metric in index
)

Output:

    A         B                  0          1         2  3                            4
--  --------  --------  ----------  ---------  --------  ---------------------------  -----------------------------
 0  metric_1  metric2    0.0620342   0.77113   1         ['US', 'mobile', 'google']   ['UK', 'desktop', 'facebook']
 1  metric_1  metric_3  -0.960769   -0.128034  0.778376  ['CAN', 'tablet', 'google']  ['UK', 'desktop', 'yahoo']
 2  metric_1  metric_4  -0.467897    0.660026  1         ['PT', 'desktop', 'gmail']   ['UK', 'desktop', 'apple']
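The output above still carries the numeric column labels 0 to 4, while the question asks for C to G. A minimal sketch of one way to label them, assuming the same approach and an abbreviated copy of the question's data (floats rounded here for readability; the names `keys` and `result` are only illustrative):

```python
import pandas as pd

# Abbreviated copy of the question's data (floats rounded for readability).
my_dict = {
    'metric_1-metric2': [0.062, 0.771, 1.0,
                         ['US', 'mobile', 'google'], ['UK', 'desktop', 'facebook']],
    'metric_1-metric_3': [-0.961, -0.128, 0.778,
                          ['CAN', 'tablet', 'google'], ['UK', 'desktop', 'yahoo']],
    'metric_1-metric_4': [-0.468, 0.660, 1.0,
                          ['PT', 'desktop', 'gmail'], ['UK', 'desktop', 'apple']],
}

df = pd.DataFrame(my_dict).T      # one row per key, columns 0..4
df.columns = list('CDEFG')        # label the five value columns

# Split the index (the dictionary keys) into two new columns A and B.
keys = (df.index.to_series()
          .str.split('-', expand=True)
          .rename(columns={0: 'A', 1: 'B'}))

# Join on the shared index, then discard the original key index.
result = keys.join(df).reset_index(drop=True)
print(result.columns.tolist())    # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
```

Because each original column mixes floats and lists, the transposed frame holds `object` columns; if numeric operations on C to E are needed, something like `result[['C', 'D', 'E']].astype(float)` would probably be required.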

Use from_dict (here d is the dictionary from the question), then split the index and call reset_index at the end:

s = pd.DataFrame.from_dict(d,'index')
s.index=pd.MultiIndex.from_tuples(s.index.str.split('-').map(tuple))
s.reset_index(inplace=True)
s
Out[210]: 
    level_0   level_1  ...                      3                        4
0  metric_1   metric2  ...   [US, mobile, google]  [UK, desktop, facebook]
1  metric_1  metric_3  ...  [CAN, tablet, google]     [UK, desktop, yahoo]
2  metric_1  metric_4  ...   [PT, desktop, gmail]     [UK, desktop, apple]
[3 rows x 7 columns]
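The result above ends up with the default labels level_0, level_1 and 0 to 4. As a sketch of a variation on the same idea, the `names` argument of `MultiIndex.from_tuples` can supply A and B directly, and the value columns can be labelled before the reset. The data is an abbreviated copy of the question's dictionary, with rounded floats:

```python
import pandas as pd

# Abbreviated copy of the question's data (floats rounded for readability).
d = {
    'metric_1-metric2': [0.062, 0.771, 1.0,
                         ['US', 'mobile', 'google'], ['UK', 'desktop', 'facebook']],
    'metric_1-metric_3': [-0.961, -0.128, 0.778,
                          ['CAN', 'tablet', 'google'], ['UK', 'desktop', 'yahoo']],
    'metric_1-metric_4': [-0.468, 0.660, 1.0,
                          ['PT', 'desktop', 'gmail'], ['UK', 'desktop', 'apple']],
}

s = pd.DataFrame.from_dict(d, orient='index')   # keys become the index
s.columns = list('CDEFG')                       # label the value columns

# Turn each 'x-y' key into an (x, y) tuple and name the two index levels.
s.index = pd.MultiIndex.from_tuples(
    [tuple(key.split('-')) for key in s.index], names=['A', 'B'])

s = s.reset_index()                             # levels A and B become columns
print(s.columns.tolist())                       # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
```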

A simpler and more intuitive way to get the exact result (though less pandas-focused):

a = {"A": [], "B": [], "C": [], "D": [], "E": [], "F": [], "G": []}

for key, value in d.items():
    key = key.split('-')
    a['A'].append(key[0])
    a['B'].append(key[1])
    a['C'].append(value[0])
    a['D'].append(value[1])
    a['E'].append(value[2])
    a['F'].append(value[3])
    a['G'].append(value[4])

df = pd.DataFrame(data = a)

Here d is the original dict from the question.
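The same row-by-row idea can probably be written more compactly by building one list per row and letting pandas assign the column labels. This is only a sketch: it assumes every key contains exactly one '-' and every value holds exactly five items, and it uses an abbreviated copy of the question's data with rounded floats (the name `rows` is illustrative).

```python
import pandas as pd

# Abbreviated copy of the question's data (floats rounded for readability).
d = {
    'metric_1-metric2': [0.062, 0.771, 1.0,
                         ['US', 'mobile', 'google'], ['UK', 'desktop', 'facebook']],
    'metric_1-metric_3': [-0.961, -0.128, 0.778,
                          ['CAN', 'tablet', 'google'], ['UK', 'desktop', 'yahoo']],
    'metric_1-metric_4': [-0.468, 0.660, 1.0,
                          ['PT', 'desktop', 'gmail'], ['UK', 'desktop', 'apple']],
}

# Each row is the two key parts followed by the five values: 7 items in total.
rows = [key.split('-') + value for key, value in d.items()]

df = pd.DataFrame(rows, columns=list('ABCDEFG'))
print(df.shape)    # (3, 7)
```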
