I currently have a list of dictionaries shown below:
temp_indices_=[{0: {12:11,11:12}}, {0: {14:13,13:14}}, {0: {16:15,15:16}}, {0: {20:19,19:20}},{0: {24: 23, 23: 24, 22: 24}, 1: {24: 22, 23: 22, 22: 23}},{0: {28: 27, 27: 28, 26: 28}, 1: {28: 26, 27: 26, 26: 27}}]
To convert the list into a dataframe, the following code is called:
import pandas as pd

temp_indices = pd.DataFrame()
for ind in range(len(temp_indices_)):
    # print(ind)
    temp_indices = pd.concat([temp_indices, pd.DataFrame(temp_indices_[ind][0].items())], axis=0)
temp_indices = temp_indices.rename(columns={0: 'ind', 1: 'label_ind'})
An example of the desired output in temp_indices is shown below; it should concatenate all dictionaries into one dataframe:
   ind  label_ind
0   12         11
1   11         12
0   14         13
1   13         14
0   16         15
1   15         16
0   20         19
1   19         20
0   24         23
1   23         24
2   22         24
0   28         27
1   27         28
2   26         28
0   28         26
1   27         26
2   26         27
To improve speed, I have tried pd.Series(temp_indices_).explode().reset_index()
as well as pd.DataFrame(map(lambda i: pd.DataFrame(i[0].items()), temp_indices_)),
but I cannot drill down to the innermost dictionaries to convert them to a dataframe.
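Use a list comprehension for speedup. Three loops are used inside the list comprehension: one for iterating over the list of dictionaries, a second for accessing the values of each dictionary, and a third for accessing each key-value pair along with an increasing index. Each key-value pair is first stored as a tuple in a single 'label' column, which is then split into two columns with df['label'].tolist().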
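import pandas as pd

# One row per key-value pair: (position within the inner dict, (key, value))
data = [(ind, item) for i in temp_indices_ for value in i.values() for ind, item in enumerate(value.items())]
df = pd.DataFrame(data, columns=["Index", "label"])
# Split the (key, value) tuples in 'label' into two separate columns
df[['ind', 'label_ind']] = pd.DataFrame(df['label'].tolist(), index=df.index)
df.drop(['label'], axis=1, inplace=True)
print(df)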
    Index  ind  label_ind
0       0   12         11
1       1   11         12
2       0   14         13
3       1   13         14
4       0   16         15
5       1   15         16
6       0   20         19
7       1   19         20
8       0   24         23
9       1   23         24
10      2   22         24
11      0   24         22
12      1   23         22
13      2   22         23
14      0   28         27
15      1   27         28
16      2   26         28
17      0   28         26
18      1   27         26
19      2   26         27
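For readers who find the one-line comprehension hard to follow, the same three loops can be written out as ordinary nested for loops. This is only a sketch of the idea on a shortened copy of the sample data: it builds flat (position, key, value) rows directly, so no tuple column has to be split afterwards. The names rows, outer, inner and pos are illustrative and do not appear in the answer above.

```python
import pandas as pd

# Shortened sample in the same shape as temp_indices_ from the question
temp_indices_ = [
    {0: {12: 11, 11: 12}},
    {0: {24: 23, 23: 24, 22: 24}, 1: {24: 22, 23: 22, 22: 23}},
]

rows = []
for outer in temp_indices_:            # loop 1: each dictionary in the list
    for inner in outer.values():       # loop 2: each inner {ind: label_ind} dict
        # loop 3: each key-value pair, with its position inside the inner dict
        for pos, (ind, label_ind) in enumerate(inner.items()):
            rows.append((pos, ind, label_ind))

# Build the DataFrame once, from the complete list of rows
df = pd.DataFrame(rows, columns=["Index", "ind", "label_ind"])
print(df)
```

Collecting plain tuples first and constructing the DataFrame a single time avoids calling pd.concat inside the loop, which copies the accumulated data on every iteration.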
This sounds like a problem that can be solved with recursion, with the final output being used to create a DataFrame.
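import pandas as pd

def unpacker(data, parent_idx=None):
    # Returns a list of (parent key, key, value) tuples
    final = []
    if isinstance(data, list):
        # Top level: a list of dictionaries whose values are dictionaries
        for row in data:
            for k, v in row.items():
                if isinstance(v, dict):
                    # Recurse into the inner dictionary, remembering its key
                    unpacked = unpacker(v, parent_idx=k)
                    for row1 in unpacked:
                        final.append(row1)
    else:
        # Innermost level: emit one tuple per key-value pair
        for k1, v1 in data.items():
            final.append((parent_idx, k1, v1))
    return final

l = unpacker(temp_indices_)
df = pd.DataFrame(l, columns=["Index", "Ind", "Label_Ind"])
print(df)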
    Index  Ind  Label_Ind
0       0   12         11
1       0   11         12
2       0   14         13
3       0   13         14
4       0   16         15
5       0   15         16
6       0   20         19
7       0   19         20
8       0   24         23
9       0   23         24
10      0   22         24
11      1   24         22
12      1   23         22
13      1   22         23
14      0   28         27
15      0   27         28
16      0   26         28
17      1   28         26
18      1   27         26
19      1   26         27
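If the nesting is always exactly two levels deep, as it is in the sample data (an assumption, not something the question guarantees), the same rows can probably be produced without recursion. The sketch below uses a shortened copy of the data, and the names rows, outer, parent_key and inner are illustrative only.

```python
import pandas as pd

# Shortened sample in the same shape as temp_indices_ from the question
temp_indices_ = [
    {0: {12: 11, 11: 12}},
    {0: {24: 23, 23: 24, 22: 24}, 1: {24: 22, 23: 22, 22: 23}},
]

# One (parent key, key, value) tuple per pair, assuming two-level nesting
rows = [
    (parent_key, ind, label_ind)
    for outer in temp_indices_
    for parent_key, inner in outer.items()
    for ind, label_ind in inner.items()
]

df = pd.DataFrame(rows, columns=["Index", "Ind", "Label_Ind"])
print(df)
```

Here the Index column holds the outer dictionary key (0 or 1), matching the recursive version, rather than the position of the pair inside the inner dictionary.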