Filling a dataframe from a dictionary's keys and values: efficient way
I have the following dataframe as an example.
import pandas as pd
df_test = pd.DataFrame(data=0, index=["green","yellow","red"], columns=["bear","dog","cat"])
I have the following dictionary, whose keys and values are the same as, or related to, the index and columns of my dataframe.
d = {"green":["bear","dog"], "yellow":["bear"], "red":["bear"]}
I filled my dataframe according to these keys and values, using:
for k, v in d.items():
    for x in v:
        df_test.loc[k, x] = 1
My problem is that the dataframe and the dictionary I'm working with are very large, and this loop takes too long to run. Is there a more efficient way to do it? Maybe by iterating over the rows of the dataframe instead of the keys and values of the dictionary?
Because performance is important, use MultiLabelBinarizer:
d = {"green":["bear","dog"], "yellow":["bear"], "red":["bear"]}
from sklearn.preprocessing import MultiLabelBinarizer
mlb = MultiLabelBinarizer()
df = pd.DataFrame(mlb.fit_transform(list(d.values())),
                  columns=mlb.classes_,
                  index=list(d.keys()))
print(df)
        bear  dog
green      1    1
yellow     1    0
red        1    0
Then add the missing columns and index labels with DataFrame.reindex:
df_test = df.reindex(columns=df_test.columns, index=df_test.index, fill_value=0)
print(df_test)
        bear  dog  cat
green      1    1    0
yellow     1    0    0
red        1    0    0
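To make the reindex step concrete, here is a minimal sketch that needs only pandas. It starts from a hand-written indicator frame (standing in for the MultiLabelBinarizer output) and reindexes it against a larger set of labels. The extra row label "blue" is a hypothetical addition for illustration and is not part of the question's data.

```python
import pandas as pd

# Indicator frame covering only the labels that occur in the dictionary.
partial = pd.DataFrame(
    {"bear": [1, 1, 1], "dog": [1, 0, 0]},
    index=["green", "yellow", "red"],
)

# Target labels; "blue" is a hypothetical row absent from the dictionary.
target_index = ["green", "yellow", "red", "blue"]
target_columns = ["bear", "dog", "cat"]

# Labels missing from `partial` are created and filled with 0.
full = partial.reindex(index=target_index, columns=target_columns, fill_value=0)
print(full)
```

Because fill_value=0 is an integer, the new row and column should not force the frame to a float dtype, which would happen if the gaps were filled with NaN instead.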
Alternatively, use get_dummies():
# convert dict to a Series
s = pd.Series(d)
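# expand each list into positional columns, then one-hot encode them
# (dtype=int keeps the 1/0 output shown below; pandas 2.0+ defaults to booleans)
df = pd.get_dummies(s.apply(pd.Series), prefix='', prefix_sep='', dtype=int)
        bear  dog
green      1    1
yellow     1    0
red        1    0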
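One caveat, offered as a hedged sketch rather than as part of the original answer: expanding the lists into positional columns means that a label appearing at different positions in different lists (say "dog" first in one list and second in another) produces duplicate column names. The variant below assumes pandas 0.25 or later for Series.explode, and uses a modified dictionary, d_dup, that is hypothetical and chosen only to trigger that case.

```python
import pandas as pd

# Hypothetical data: "dog" is second in one list and first in another.
d_dup = {"green": ["bear", "dog"], "yellow": ["dog"], "red": ["bear"]}

# One row per (key, label) pair; the key is repeated in the index.
s = pd.Series(d_dup).explode()

# One-hot encode the labels, then collapse the repeated keys back to one row each.
dummies = pd.get_dummies(s, dtype=int)
df_dup = dummies.groupby(level=0, sort=False).max()
print(df_dup)
```

Taking the maximum per key keeps a 1 wherever any of that key's labels matched, and sort=False preserves the dictionary's key order.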
# convert dict to a Series
s = pd.Series(d)
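# build the positional columns directly from the lists (avoids the slower apply)
df = pd.DataFrame(s.values.tolist(), index=s.index)
# one-hot encode; dtype=int gives 1/0 rather than booleans on pandas 2.0+
new_df = pd.get_dummies(df, prefix='', prefix_sep='', dtype=int)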
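If the goal is to fill the existing frame rather than build a new one, the nested loop from the question can also be vectorized with integer positions. This is a sketch of a different technique (NumPy fancy indexing via Index.get_indexer), not something taken from the answers above. It assumes the index and column labels are unique, and it silently skips dictionary entries whose key or value is not present in the frame.

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame(data=0, index=["green", "yellow", "red"],
                       columns=["bear", "dog", "cat"])
d = {"green": ["bear", "dog"], "yellow": ["bear"], "red": ["bear"]}

# Flatten the dictionary into parallel lists of row and column labels.
row_labels = [k for k, v in d.items() for _ in v]
col_labels = [x for v in d.values() for x in v]

# Translate labels to integer positions; unknown labels come back as -1.
rows = df_test.index.get_indexer(row_labels)
cols = df_test.columns.get_indexer(col_labels)
known = (rows >= 0) & (cols >= 0)

# Set every (row, column) pair in a single NumPy assignment.
values = df_test.to_numpy(copy=True)
values[rows[known], cols[known]] = 1
filled = pd.DataFrame(values, index=df_test.index, columns=df_test.columns)
print(filled)
```

This replaces one .loc assignment per pair with a single array write, which is usually where the loop spends its time; whether it beats the approaches above on a given dataset is something to measure rather than assume.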