Convert a dictionary of nested lists to a pandas DataFrame

I have a Python dictionary as below:

dict1={808: [['a', 5.4, 'b'],
  ['c', 4.1 , 'b'],
  ['d', 3.7 , 'f']]} 

I want to convert it into a DataFrame as below:

memberid  userid score related
808       a      5.4     b
808       c      4.1     b
808       d      3.7     f

I tried the code below:

df=pd.DataFrame.from_dict(dict1,orient='index')

The result is not what I wanted.

Does anybody know how to fix this? Thanks!

Let's convert each nested list value to a DataFrame, and then call pd.concat.

columns = ['userid', 'score', 'related']

df_dict = {k : pd.DataFrame(v, columns=columns) for k, v in dict1.items()}

df = (pd.concat(df_dict)
        .reset_index(level=1, drop=True)
        .rename_axis('memberid')
        .reset_index()
)

Or, in similar fashion:

import numpy as np

df = pd.concat([
    # repeat the key once per row so it becomes that frame's index
    pd.DataFrame(v, columns=columns, index=np.repeat(k, len(v)))
    for k, v in dict1.items()
]).rename_axis('memberid').reset_index()

df

   memberid userid  score related
0       808      a    5.4       b
1       808      c    4.1       b
2       808      d    3.7       f 

Important note: this solution also works for multiple key-value pairs, where each key may not have the same number of lists. But because of this flexibility, it may become slow for large DataFrames. In that case, the modified solution below works if dict1 contains just one entry:

k, v = list(dict1.items())[0]
pd.DataFrame(v, columns=columns, index=np.repeat(k, len(v))).reset_index()

   index userid  score related
0    808      a    5.4       b
1    808      c    4.1       b
2    808      d    3.7       f
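To illustrate the multi-key case mentioned in the note above, here is a minimal sketch of the pd.concat approach applied to a dictionary with two keys of different lengths. The dictionary dict2 and its key 909 are made up for this illustration and are not part of the original question.

```python
import pandas as pd

# Hypothetical data: key 909 is invented for illustration and has fewer rows than 808.
dict2 = {
    808: [['a', 5.4, 'b'], ['c', 4.1, 'b'], ['d', 3.7, 'f']],
    909: [['e', 2.2, 'a']],
}
columns = ['userid', 'score', 'related']

# One DataFrame per key; pd.concat on a dict uses the keys as the outer index level.
frames = {k: pd.DataFrame(v, columns=columns) for k, v in dict2.items()}
df_multi = (
    pd.concat(frames)
      .reset_index(level=1, drop=True)  # drop the per-frame row counter
      .rename_axis('memberid')          # name the remaining index level
      .reset_index()                    # turn it into a regular column
)
print(df_multi)
```

Each key contributes as many rows as it has inner lists, so the lists do not need to be the same length across keys.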

Using pd.Series a couple of times:

df=pd.Series(dict1).apply(pd.Series).stack().apply(pd.Series).reset_index().drop(columns='level_1')
df.columns=['memberid','userid', 'score', 'related']
df
Out[626]: 
   memberid userid  score related
0       808      a    5.4       b
1       808      c    4.1       b
2       808      d    3.7       f

Feeding your dictionary values into pd.DataFrame is one way.

Here we use the next(iter(some_view)) syntax to extract the only key and the only value.

This is an efficient solution when you can guarantee that your dictionary has only one key and that its value is a list of lists.

df = pd.DataFrame(next(iter(dict1.values())), columns=['userid', 'score', 'related'])\
       .assign(memberid=next(iter(dict1.keys())))

print(df)

  userid  score related  memberid
0      a    5.4       b       808
1      c    4.1       b       808
2      d    3.7       f       808
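The output above places memberid last, whereas the question lists it first. If the column order matters, one option (a sketch, still assuming dict1 has exactly one key) is to use DataFrame.insert, which puts a column at a chosen position. The name df_ordered is introduced here only for illustration.

```python
import pandas as pd

dict1 = {808: [['a', 5.4, 'b'], ['c', 4.1, 'b'], ['d', 3.7, 'f']]}

# Assumes dict1 has exactly one key, as in the answer above.
key, rows = next(iter(dict1.items()))

df_ordered = pd.DataFrame(rows, columns=['userid', 'score', 'related'])
# Insert at position 0; the scalar key is broadcast to every row.
df_ordered.insert(0, 'memberid', key)
print(df_ordered)
```

Note that insert modifies the DataFrame in place, unlike assign, which returns a new object.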
