I have a dictionary like this:
pred_dict = {
    ('african zebra', 'arabian horse'): [('Blue Whale', 0.49859235), ('Ferrari', 0.5013809), ('african zebra', 0.49264234), ('arabian horse', 0.5186422), ('bobcat', 0.5096679)],
    ('cheetah', 'mountain lion'): [('Blue Whale', 0.48881102), ('Ferrari', 0.502793), ('african zebra', 0.48751196), ('arabian horse', 0.49272105), ('bobcat', 0.5228181)],
}
I would like to convert it to a DataFrame like this:
Text                               | Blue Whale | Ferrari   | african zebra | arabian horse | bobcat
('african zebra', 'arabian horse') | 0.49859235 | 0.5013809 | 0.49264234    | 0.5186422     | 0.5096679
('cheetah', 'mountain lion')       | 0.48881102 | 0.502793  | 0.48751196    | 0.49272105    | 0.5228181
Every value in the dictionary is a list with the same number of tuples, and the first elements of those tuples are identical across all values. The dict keys should go in the 'Text' column, the first element of each tuple should become a column name, and the second element (the score, a float) should become the cell value.
Any suggestions would be helpful. Here is what I have tried so far:
In [12]: text = list(pred_dict.keys())
In [13]: values = list(pred_dict.values())
In [14]: pred_df = pd.DataFrame({'text': text, 'label_scores': values})
In [15]: pred_df
Out[15]:
text label_scores
0 (african zebra, arabian horse) [(Blue Whale, 0.49859235), (Ferrari, 0.5013809...
1 (cheetah, mountain lion) [(Blue Whale, 0.48881102), (Ferrari, 0.502793)...
In [19]: df_scores = pred_df['label_scores']
In [21]: df_scores
Out[21]:
0 [(Blue Whale, 0.49859235), (Ferrari, 0.5013809...
1 [(Blue Whale, 0.48881102), (Ferrari, 0.502793)...
Name: label_scores, dtype: object
In [22]: labels = [t[1] for t in df_scores[0]]
In [23]: labels
Out[23]: [0.49859235, 0.5013809, 0.49264234, 0.5186422, 0.5096679]
In [24]: labels = [t[0] for t in df_scores[0]]
In [25]: labels
Out[25]: ['Blue Whale', 'Ferrari', 'african zebra', 'arabian horse', 'bobcat']
In [26]: scores = [t[1] for t in df_scores[0]]
In [27]: scores
Out[27]: [0.49859235, 0.5013809, 0.49264234, 0.5186422, 0.5096679]
In [28]: scores = [t[1] for t in df_scores[1]]
In [29]: scores
Out[29]: [0.48881102, 0.502793, 0.48751196, 0.49272105, 0.5228181]
pred_dict = {('african zebra', 'arabian horse'): [('Blue Whale', 0.49859235), ('Ferrari', 0.5013809), ('african zebra', 0.49264234), ('arabian horse', 0.5186422), ('bobcat', 0.5096679)], ('cheetah', 'mountain lion'): [('Blue Whale', 0.48881102), ('Ferrari', 0.502793), ('african zebra', 0.48751196), ('arabian horse', 0.49272105), ('bobcat', 0.5228181)]}
This should do it:
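import pandas as pd

# Build one frame per dict key (the key is used as the index), stack them,
# then move the labels from the 'Text' column into the columns.
pd.concat(
    [pd.DataFrame(r, columns=['Text', 'value'], index=[t] * len(r))
     for (t, r) in pred_dict.items()]
).set_index('Text', append=True).unstack('Text')['value']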
It's not pretty but it works:
import pandas as pd

pred_dict = {
    ('african zebra', 'arabian horse'): [('Blue Whale', 0.49859235),
                                         ('Ferrari', 0.5013809),
                                         ('african zebra', 0.49264234),
                                         ('arabian horse', 0.5186422),
                                         ('bobcat', 0.5096679)],
    ('cheetah', 'mountain lion'): [('Blue Whale', 0.48881102),
                                   ('Ferrari', 0.502793),
                                   ('african zebra', 0.48751196),
                                   ('arabian horse', 0.49272105),
                                   ('bobcat', 0.5228181)]
}
# Transpose so each dict key becomes one row; every cell holds a (label, score) tuple.
df = pd.DataFrame(pred_dict).T
# Take the column names from the labels in the first row.
df.columns = [pair[0] for pair in df.iloc[0]]
# Keep only the score from each (label, score) tuple.
df = df.apply(lambda col: [pair[1] for pair in col])
# Move the two index levels into columns and merge them back into a single tuple column.
df.reset_index(inplace=True)
df.insert(0, "Text", list(zip(df.level_0, df.level_1)))
df.drop(["level_0", "level_1"], axis=1, inplace=True)
The output of which is:
Text Blue Whale ... arabian horse bobcat
0 (african zebra, arabian horse) 0.498592 ... 0.518642 0.509668
1 (cheetah, mountain lion) 0.488811 ... 0.492721 0.522818
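Note that taking the column names from the first row only works if every value lists its labels in the same order, as the question says it does. A minimal sketch of a check for that assumption is below; the helper name `labels_are_consistent` is made up for illustration and is not part of pandas.

```python
pred_dict = {
    ('african zebra', 'arabian horse'): [('Blue Whale', 0.49859235), ('Ferrari', 0.5013809),
                                         ('african zebra', 0.49264234), ('arabian horse', 0.5186422),
                                         ('bobcat', 0.5096679)],
    ('cheetah', 'mountain lion'): [('Blue Whale', 0.48881102), ('Ferrari', 0.502793),
                                   ('african zebra', 0.48751196), ('arabian horse', 0.49272105),
                                   ('bobcat', 0.5228181)],
}


def labels_are_consistent(d):
    """Hypothetical helper: True if every value lists the same labels in the same order."""
    # Extract just the label from each (label, score) pair, one list per dict value.
    label_lists = [[label for label, _ in pairs] for pairs in d.values()]
    # Compare every label list against the first one.
    return all(labels == label_lists[0] for labels in label_lists)


print(labels_are_consistent(pred_dict))
```

If the check fails, positional approaches like the one above would silently put scores under the wrong column names, so a label-based lookup would be the safer choice.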
OK, after some trial and error I was able to do it. Here is how:
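import pandas as pd

text = list(pred_dict.keys())
values = list(pred_dict.values())
df_1 = pd.DataFrame({'text': text})
score_dict = {}
# mlb_classes is not defined in this snippet; it is iterated as the collection of label names.
for label in mlb_classes:
    score_list = []
    # Collect this label's score from every list of (label, score) tuples.
    for t_list in values:
        for t in t_list:
            if t[0] == label:
                score_list.append(t[1])
    score_dict[label] = score_list
df_2 = pd.DataFrame(score_dict)
score_df = pd.concat([df_1, df_2], axis=1)
print(score_df)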
Output:
text Blue Whale Ferrari african zebra arabian horse bobcat
0 (african zebra, arabian horse) 0.519343 0.511951 0.512639 0.527919 0.491461 0.516240
1 (cheetah, mountain lion) 0.495197 0.527627 0.497516 0.512571 0.488823 0.510277
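For what it's worth, the nested loop above looks up each label's score by name, and the same lookup can probably be done by turning each list of (label, score) pairs into a dict. This is only a sketch run against the sample `pred_dict` from the question, and it assumes the labels can be read from the data itself rather than from `mlb_classes`:

```python
import pandas as pd

pred_dict = {
    ('african zebra', 'arabian horse'): [('Blue Whale', 0.49859235), ('Ferrari', 0.5013809),
                                         ('african zebra', 0.49264234), ('arabian horse', 0.5186422),
                                         ('bobcat', 0.5096679)],
    ('cheetah', 'mountain lion'): [('Blue Whale', 0.48881102), ('Ferrari', 0.502793),
                                   ('african zebra', 0.48751196), ('arabian horse', 0.49272105),
                                   ('bobcat', 0.5228181)],
}

# dict(pairs) maps label -> score, so each dict value becomes one row keyed by label.
score_df = pd.DataFrame([dict(pairs) for pairs in pred_dict.values()])
# Put the dict keys in front as the 'text' column.
score_df.insert(0, "text", list(pred_dict.keys()))
print(score_df)
```

Because the rows are matched by label name, the order of the tuples inside each list should not matter here; a label missing from one of the lists would show up as NaN in that row.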