
Create Pandas DataFrame from dictionary with list, tuple or dict as a single entry value

I have a dictionary:

data = { 'x' : 1,
         'y' : [1,2,3],
         'z' : (4,5,6),
         'w' : {1:2, 3:4}
       }

I'd like to construct a Pandas DataFrame such that the list and tuple do not get broadcast:

df = pd.DataFrame(some_transformation(data), index=['a'])

to get

df = 
      x         y         z          w
a     1   (1,2,3)   (4,5,6)  (1,2,3,4)

Or some sort of flattening and/or stringifying of the list/tuple/dict. What is the easiest / most efficient way of doing so, without having to go down into the exact data structure of each dictionary entry?

Without going down into the exact data structure, I think the easiest way to achieve what you want is:

      data={k:str(v) for k,v in data.items()}

The statement above converts every value to a string. Now you can convert the data dictionary to a DataFrame with the line below:

    df=pd.DataFrame(data, index=[0])

This gives output of the following form:

           w        x          y          z
    0 {1: 2, 3: 4}  1      [1, 2, 3]  (4, 5, 6)

Now for your desired output (you can use other, more efficient methods for string replacement in a DataFrame as well):

      for acol in df.columns:
            # each column holds one string, so clean that single value
            df[acol]=df[acol].values[0].strip('[{()}]')
            df[acol]=df[acol].values[0].replace(':', ',')

Output looks like:

                 w         x        y          z

            1, 2, 3, 4     1    1, 2, 3     4, 5, 6
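The stringify-then-clean steps above can be sketched end to end as follows. This is a minimal sketch rather than the answerer's exact code: it assumes the `data` dictionary from the question, uses the vectorized `.str` accessor instead of indexing `.values[0]`, and the variable name `as_text` is introduced here only for illustration.

```python
import pandas as pd

data = {'x': 1, 'y': [1, 2, 3], 'z': (4, 5, 6), 'w': {1: 2, 3: 4}}

# Stringify every value so pandas only sees scalars (nothing left to broadcast).
as_text = {k: str(v) for k, v in data.items()}
df = pd.DataFrame(as_text, index=['a'])

# Strip the surrounding brackets and turn the dict's "key: value" into "key, value".
for col in df.columns:
    df[col] = df[col].str.strip('[{()}]').str.replace(':', ',', regex=False)

print(df)
```

Every cell is a plain string after this, so numeric columns such as `x` would need converting back if they are used in calculations.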

You cannot apply one transformation to both lists/tuples and dictionaries, because they have very different properties. You can flatten all dictionaries and then create a pd.Series out of the updated dictionary.

for key in data:
    if isinstance(data[key],dict):
        data[key] = list(data[key].keys())+list(data[key].values())
pd.Series(data)
#w    [1, 3, 2, 4]
#x               1
#y       [1, 2, 3]
#z       (4, 5, 6)
#dtype: object

Convert it further into a DataFrame, if you want:

df = pd.DataFrame(pd.Series(data)).T
#              w  x          y          z
#0  [1, 3, 2, 4]  1  [1, 2, 3]  (4, 5, 6)

You can handle lists in the same spirit (convert them to tuples).

This is one way.

from itertools import chain

def transformer(data):
    for k, v in data.items():
        if isinstance(v, list):
            data[k] = [tuple(v)]
        elif isinstance(v, dict):
            data[k] = [tuple(chain(*(v.items())))]
        else:
            data[k] = [v]
    return data

df = pd.DataFrame(transformer(data), index=['a'])

#               w  x          y          z
# a  (1, 2, 3, 4)  1  (1, 2, 3)  (4, 5, 6)

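You can use set_value to assign those elements to the df and then transform dict and list to tuples. Note that set_value was deprecated in pandas 0.21 and removed in 1.0 (the .at accessor replaces it), and applymap has been deprecated since pandas 2.1 in favor of DataFrame.map, so the code below needs an older pandas version.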

df=pd.DataFrame(columns=data.keys())
[df.set_value(0,k,v) for k,v in data.items()]
df = df.applymap(lambda x: sum([[k,v] for k,v in x.items()],[]) if isinstance(x,dict) else x)
df = df.applymap(lambda x: tuple(x) if isinstance(x,list) else x)
Out[716]: 
   x          y          z             w
0  1  (1, 2, 3)  (4, 5, 6)  (1, 2, 3, 4)
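Since set_value is not available in pandas 1.0 and later, a similar result can be sketched without it. The version below is an illustration under stated assumptions, not the answerer's method: it converts each entry first and then passes a list holding one dictionary to the constructor, which pandas reads as a single row. The helper name `to_cell` is hypothetical, and `data` is the dictionary from the question.

```python
import pandas as pd

data = {'x': 1, 'y': [1, 2, 3], 'z': (4, 5, 6), 'w': {1: 2, 3: 4}}

def to_cell(value):
    """Hypothetical helper: turn one dictionary entry into a single cell value."""
    if isinstance(value, dict):
        # Interleave keys and values: {1: 2, 3: 4} -> (1, 2, 3, 4)
        return tuple(item for pair in value.items() for item in pair)
    if isinstance(value, list):
        return tuple(value)
    return value

# A list containing one dict is treated as one row, so sequences are not broadcast.
row = {key: to_cell(value) for key, value in data.items()}
df = pd.DataFrame([row], index=['a'])

print(df)
```

Building a new `row` dictionary also leaves the original `data` untouched, which can matter if it is reused elsewhere.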
