Convert pandas df from long to wide and then into a sparse matrix
I have this dataset:
ARTID INFO_1 INFO_2
00001 some_info_11 some_info_21
00002 some_info_12 some_info_22
00003 some_info_13 some_info_23
I want to turn it into this:
ARTID some_info_11 some_info_12 some_info_13 some_info_21 some_info_22 some_info_23
00001 1 0 0 1 0 0
00002 0 1 0 0 1 0
But I need the result to be a sparse matrix. What is the most memory-friendly way to do this?
Use pd.get_dummies() and pd.concat():
df1 = pd.concat([df.ARTID,pd.get_dummies(df[['INFO_1','INFO_2']],prefix='',prefix_sep='')],axis=1)
print(df1)
ARTID some_info_11 some_info_12 some_info_13 some_info_21 \
0 00001 1 0 0 1
1 00002 0 1 0 0
2 00003 0 0 1 0
some_info_22 some_info_23
0 0 0
1 1 0
2 0 1
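The frame above is still dense, so it does not yet answer the sparse-matrix part of the question. A minimal sketch of one way to get there, assuming pandas 1.0 or later and SciPy are installed: pass sparse=True to pd.get_dummies() and convert the result with the .sparse.to_coo() accessor. The sample frame is rebuilt from the question, and the names dummies, matrix, row_labels and col_labels are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; ARTID is stored as a string
# so that the leading zeros are preserved.
df = pd.DataFrame({
    "ARTID": ["00001", "00002", "00003"],
    "INFO_1": ["some_info_11", "some_info_12", "some_info_13"],
    "INFO_2": ["some_info_21", "some_info_22", "some_info_23"],
})

# sparse=True backs each dummy column with a SparseArray, so the zeros
# are not stored; dtype=np.uint8 yields 0/1 values instead of booleans.
dummies = pd.get_dummies(
    df[["INFO_1", "INFO_2"]],
    prefix="",
    prefix_sep="",
    sparse=True,
    dtype=np.uint8,
)

# Convert the all-sparse frame to a SciPy COO matrix, then to CSR,
# which is usually the more convenient format for row slicing.
matrix = dummies.sparse.to_coo().tocsr()

# The matrix carries no labels, so keep them alongside it:
# row i belongs to row_labels[i], column j to col_labels[j].
row_labels = df["ARTID"].to_numpy()
col_labels = dummies.columns.to_numpy()

print(matrix.toarray())
```

ARTID is left out of the matrix on purpose, because a SciPy sparse matrix holds a single numeric dtype and the IDs are strings.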
If ARTID is allowed to be the index, you can use:
pd.get_dummies(df[['INFO_1','INFO_2']],prefix='',prefix_sep='').set_index(df.ARTID)
some_info_11 some_info_12 some_info_13 some_info_21 some_info_22 \
ARTID
00001 1 0 0 1 0
00002 0 1 0 0 1
00003 0 0 1 0 0
some_info_23
ARTID
00001 0
00002 0
00003 1
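If memory is the main concern, a different route is to skip pd.get_dummies() altogether and build the SciPy matrix directly from integer codes. This is not the approach shown above, only a sketch of an alternative, assuming SciPy is available; the names long_df, row_codes, col_codes, row_labels and col_labels are made up for illustration.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "ARTID": ["00001", "00002", "00003"],
    "INFO_1": ["some_info_11", "some_info_12", "some_info_13"],
    "INFO_2": ["some_info_21", "some_info_22", "some_info_23"],
})

# Reshape to one (ARTID, value) pair per row.
long_df = df.melt(id_vars="ARTID", value_name="value")

# factorize maps each distinct label to an integer code;
# sort=True orders the columns alphabetically by value.
row_codes, row_labels = pd.factorize(long_df["ARTID"])
col_codes, col_labels = pd.factorize(long_df["value"], sort=True)

# One stored entry per (row, column) pair. Note that csr_matrix sums
# duplicate pairs, so a value repeated for the same ARTID would give 2.
data = np.ones(len(long_df), dtype=np.uint8)
matrix = csr_matrix(
    (data, (row_codes, col_codes)),
    shape=(len(row_labels), len(col_labels)),
)

print(matrix.toarray())
```

Here row_labels and col_labels play the role of the ARTID index and the column names, so row i of matrix corresponds to row_labels[i].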