How do I unstack a dictionary column in a pandas DataFrame?

Question

I have a dataframe of the following format:

import pandas as pd

df = pd.DataFrame(
                 {"company":["McDonalds","Arbys","Wendys"],
                  "City":["Dallas","Austin","Chicago"],
                  "Datetime":[{"11/23/2016":"1","09/06/2011":"2"},
                              {"02/23/2012":"1","04/06/2013":"2"},
                              {"10/23/2017":"1","05/06/2019":"2"}]})
df
>>>    company    City             Datetime
>>>    McDonalds  Dallas  {'11/23/2016': '1', '09/06/2011':'2'}
>>>    Arbys      Austin  {'02/23/2012': '1',  '04/06/2013':'2'}
>>>    Wendys     Chicago {'10/23/2017': '1',  '05/06/2019':'2'}

In my real data, the dictionary inside the "Datetime" column is stored as a string, so I must first read it into a Python dictionary using ast.literal_eval.
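A minimal sketch of that parsing step, assuming the column arrives as string representations of dictionaries (for example after reading a CSV). The frame `raw` and its contents are made up for illustration and are not part of the original data:

```python
import ast

import pandas as pd

# Hypothetical frame in which "Datetime" holds strings rather than dicts
raw = pd.DataFrame({
    "company": ["McDonalds", "Arbys"],
    "Datetime": ['{"11/23/2016": "1", "09/06/2011": "2"}',
                 '{"02/23/2012": "1", "04/06/2013": "2"}'],
})

# ast.literal_eval evaluates a string containing a Python literal
# (dict, list, str, number, ...) without executing arbitrary code
raw["Datetime"] = raw["Datetime"].apply(ast.literal_eval)

print(type(raw.loc[0, "Datetime"]))
```

After this step each cell is a real dict, so the approaches in the answers can be applied to it directly.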

I would like to unstack the dataframe based on the values in Datetime so that the output looks as follows:

df_out

>>>    Company    City    Date            Value
>>>    McDonalds  Dallas  11/23/2016      1
>>>    McDonalds  Dallas  09/06/2011      2
>>>    Arbys      Austin  02/23/2012      1
>>>    Arbys      Austin  04/06/2013      2
>>>    Wendys     Chicago 10/23/2017      1
>>>    Wendys     Chicago 05/06/2019      2

I am a bit lost on this one. I know I will need to iterate over the rows and read each dictionary, so I had the idea of using df.iterrows(), creating namedtuples of each row's values that won't change, and then looping over the dictionary itself to attach the different datetime values. I am not sure this is the most efficient way, though. Any tips would be appreciated.
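For reference, the row-by-row idea described above could be sketched roughly as follows. This uses df.itertuples() in place of df.iterrows(), since it already yields namedtuples; the names `records` and `df_out` are illustrative only, and this is just one possible reading of the approach:

```python
import pandas as pd

df = pd.DataFrame({
    "company": ["McDonalds", "Arbys", "Wendys"],
    "City": ["Dallas", "Austin", "Chicago"],
    "Datetime": [{"11/23/2016": "1", "09/06/2011": "2"},
                 {"02/23/2012": "1", "04/06/2013": "2"},
                 {"10/23/2017": "1", "05/06/2019": "2"}],
})

# Build one output record per (row, dictionary entry) pair:
# the fixed row values are repeated for every date in that row's dict
records = [
    (row.company, row.City, date, value)
    for row in df.itertuples(index=False)
    for date, value in row.Datetime.items()
]

df_out = pd.DataFrame(records, columns=["company", "City", "Date", "Value"])
print(df_out)
```

This loops in Python, so it may be slower on large frames than the vectorized approaches.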

Answer 1

My try:

(df.drop('Datetime', axis=1)
  .merge(df.Datetime.agg(lambda x: pd.Series(x))
           .stack().reset_index(-1),
         left_index=True, 
         right_index=True
        )
   .rename(columns={'level_1':'Date', 0:'Value'})
)

Output:

     company     City        Date Value
0  McDonalds   Dallas  11/23/2016     1
0  McDonalds   Dallas  09/06/2011     2
1      Arbys   Austin  02/23/2012     1
1      Arbys   Austin  04/06/2013     2
2     Wendys  Chicago  10/23/2017     1
2     Wendys  Chicago  05/06/2019     2

Answer 2

I would flatten the dictionaries in Datetime and construct a new df from them. Finally, join back.

from itertools import chain
df1 = pd.DataFrame(chain.from_iterable(df.Datetime.map(dict.items)), 
                   index=df.index.repeat(df.Datetime.str.len()), 
                   columns=['Date', 'Val'])

Out[551]:
         Date Val
0  11/23/2016   1
0  09/06/2011   2
1  02/23/2012   1
1  04/06/2013   2
2  10/23/2017   1
2  05/06/2019   2

# pass the column by keyword: the positional axis argument was removed in pandas 2.0
df_final = df.drop(columns='Datetime').join(df1)

Out[554]:
     company     City        Date Val
0  McDonalds   Dallas  11/23/2016   1
0  McDonalds   Dallas  09/06/2011   2
1      Arbys   Austin  02/23/2012   1
1      Arbys   Austin  04/06/2013   2
2     Wendys  Chicago  10/23/2017   1
2     Wendys  Chicago  05/06/2019   2

Answer 3

Here is a clean solution:

Solution

df = df.set_index(['company', 'City'])
df_stack = (df['Datetime'].apply(pd.Series)
            .stack().reset_index()
            .rename(columns={'level_2': 'Datetime', 0: 'val'}))

Output

print(df_stack.to_string())

     company     City    Datetime val
0  McDonalds   Dallas  11/23/2016   1
1  McDonalds   Dallas  09/06/2011   2
2      Arbys   Austin  02/23/2012   1
3      Arbys   Austin  04/06/2013   2
4     Wendys  Chicago  10/23/2017   1
5     Wendys  Chicago  05/06/2019   2

Answer 1
1 2019-12-06 20:22:08

Answer 2
1 2019-12-06 21:09:03

Answer 3
1 2019-12-06 22:06:36
