Fastest way to convert a list of dictionaries (each having multiple sub-dictionaries) into a single dataframe

Question

I currently have a list of dictionaries shown below:

temp_indices_=[{0: {12:11,11:12}}, {0: {14:13,13:14}}, {0: {16:15,15:16}}, {0: {20:19,19:20}},{0: {24: 23, 23: 24, 22: 24}, 1: {24: 22, 23: 22, 22: 23}},{0: {28: 27, 27: 28, 26: 28}, 1: {28: 26, 27: 26, 26: 27}}]

To convert the list into a dataframe, the following code is called:

  import pandas as pd

  temp_indices = pd.DataFrame()

  for ind in range(len(temp_indices_)):
      # Only the sub-dictionary stored under key 0 is read here
      temp_indices = pd.concat([temp_indices, pd.DataFrame(temp_indices_[ind][0].items())], axis=0)
  temp_indices = temp_indices.rename(columns={0: 'ind', 1: 'label_ind'})

Example output from temp_indices is shown below; the goal is to concatenate all the dictionaries into one dataframe:

   ind  label_ind
0   12  11
1   11  12
0   14  13
1   13  14
0   16  15
1   15  16
0   20  19
1   19  20
0   24  23
1   23  24
2   22  24
0   28  27
1   27  28
2   26  28
0   28  26
1   27  26
2   26  27

To improve speed, I have tried pd.Series(temp_indices_).explode().reset_index() as well as pd.DataFrame(map(lambda i: pd.DataFrame(i[0].items()), temp_indices_)), but I cannot drill down to the innermost dictionaries to convert them to a dataframe.

Answer 1

Use a list comprehension for a speedup:

The list comprehension uses three loops: the first iterates over the list of dictionaries, the second accesses the values of each dictionary (the sub-dictionaries), and the third accesses each key-value pair together with its increasing index.
Then make a dataframe from the resulting list.
Since the column named 'label' contains tuples, split it into two columns using df['label'].tolist().
Finally, delete the column named 'label'.

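import pandas as pd

# Pair each (key, value) item with its position inside its sub-dictionary
data = [(ind,list(value.items())[ind]) for i in temp_indices_ for value in i.values() for ind in range(len(value))]
df = pd.DataFrame(data, columns =["Index","label"])
# Split the (key, value) tuples in 'label' into two separate columns
df[['ind', 'label_ind']] = pd.DataFrame(df['label'].tolist(), index=df.index)
df.drop(['label'], axis=1, inplace=True)
print(df)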

        Index  ind  label_ind
    0       0   12         11
    1       1   11         12
    2       0   14         13
    3       1   13         14
    4       0   16         15
    5       1   15         16
    6       0   20         19
    7       1   19         20
    8       0   24         23
    9       1   23         24
    10      2   22         24
    11      0   24         22
    12      1   23         22
    13      2   22         23
    14      0   28         27
    15      1   27         28
    16      2   26         28
    17      0   28         26
    18      1   27         26
    19      2   26         27
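
As a possible variation on the same idea (a sketch, not part of the original answer): enumerate() can supply the position and the key-value pair in one pass, so the intermediate 'label' column and the repeated list(value.items()) call are not needed. The names rows, outer, sub and pos are only illustrative.

```python
import pandas as pd

temp_indices_ = [
    {0: {12: 11, 11: 12}},
    {0: {14: 13, 13: 14}},
    {0: {16: 15, 15: 16}},
    {0: {20: 19, 19: 20}},
    {0: {24: 23, 23: 24, 22: 24}, 1: {24: 22, 23: 22, 22: 23}},
    {0: {28: 27, 27: 28, 26: 28}, 1: {28: 26, 27: 26, 26: 27}},
]

# One tuple per innermost key-value pair:
# (position inside the sub-dictionary, key, value)
rows = [
    (pos, key, val)
    for outer in temp_indices_                      # each dictionary in the list
    for sub in outer.values()                       # each sub-dictionary
    for pos, (key, val) in enumerate(sub.items())   # each pair with its position
]

df = pd.DataFrame(rows, columns=["Index", "ind", "label_ind"])
print(df)
```

This should give the same three columns as above while building the dataframe only once at the end.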

Answer 2

This sounds like a problem that can be solved with recursion, with the final output used to create a DataFrame.

def unpacker(data, parent_idx=None):
    final = []
    
    if isinstance(data, list):
        for row in data:
            for k, v in row.items():
                if isinstance(v, dict):
                    unpacked = unpacker(v, parent_idx=k)
                    for row1 in unpacked:
                        final.append(row1)
    else:
        for k1, v1 in data.items():
            final.append((parent_idx, k1, v1))
    
    return final

l = unpacker(temp_indices_)
df = pd.DataFrame(l, columns=["Index", "Ind", "Label_Ind"])
print(df)

    Index  Ind  Label_Ind
0       0   12         11
1       0   11         12
2       0   14         13
3       0   13         14
4       0   16         15
5       0   15         16
6       0   20         19
7       0   19         20
8       0   24         23
9       0   23         24
10      0   22         24
11      1   24         22
12      1   23         22
13      1   22         23
14      0   28         27
15      0   27         28
16      0   26         28
17      1   28         26
18      1   27         26
19      1   26         27
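
If the nesting is always exactly two levels deep, as in the sample data, the recursion is probably not required. A minimal sketch under that assumption (the names rows, outer, sub_key and sub are illustrative, not from the answer) that keeps the sub-dictionary key as the Index column:

```python
import pandas as pd

temp_indices_ = [
    {0: {12: 11, 11: 12}},
    {0: {14: 13, 13: 14}},
    {0: {16: 15, 15: 16}},
    {0: {20: 19, 19: 20}},
    {0: {24: 23, 23: 24, 22: 24}, 1: {24: 22, 23: 22, 22: 23}},
    {0: {28: 27, 27: 28, 26: 28}, 1: {28: 26, 27: 26, 26: 27}},
]

# Assumes every value of every outer dictionary is itself a flat dictionary.
# Each tuple is (sub-dictionary key, inner key, inner value).
rows = [
    (sub_key, key, val)
    for outer in temp_indices_
    for sub_key, sub in outer.items()
    for key, val in sub.items()
]

df = pd.DataFrame(rows, columns=["Index", "Ind", "Label_Ind"])
print(df)
```

For data with deeper or irregular nesting, the recursive version is the safer choice.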

2 answers

solution1
1 ACCEPTED 2021-05-04 12:21:39

solution2
0 2021-05-04 12:11:38
