
Convert pandas df from long to wide and then into a sparse matrix

I have this dataset:

ARTID    INFO_1         INFO_2 
00001   some_info_11   some_info_21
00002   some_info_12   some_info_22
00003   some_info_13   some_info_23

I want to turn it into this:

ARTID    some_info_11   some_info_12   some_info_13   some_info_21   some_info_22   some_info_23
00001    1              0              0              1              0              0
00002    0              1              0              0              1              0

But I need the result to be a sparse matrix. What is the most memory-friendly way to do this?

Use pd.get_dummies() together with pd.concat():

df1 = pd.concat([df.ARTID,pd.get_dummies(df[['INFO_1','INFO_2']],prefix='',prefix_sep='')],axis=1)

print(df1)
  ARTID  some_info_11  some_info_12  some_info_13  some_info_21  \
0 00001             1             0             0             1   
1 00002             0             1             0             0   
2 00003             0             0             1             0   

   some_info_22  some_info_23  
0             0             0  
1             1             0  
2             0             1  
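
The frame above is dense, while the question asks for a sparse result. As a minimal sketch, assuming a recent pandas version (one that provides the DataFrame.sparse accessor) and SciPy installed, get_dummies can be asked for sparse columns, which can then be converted to a SciPy matrix. The sample frame and the names wide and coo are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question; ARTID is kept as a string
# so the leading zeros survive.
df = pd.DataFrame({
    "ARTID": ["00001", "00002", "00003"],
    "INFO_1": ["some_info_11", "some_info_12", "some_info_13"],
    "INFO_2": ["some_info_21", "some_info_22", "some_info_23"],
})

# sparse=True makes every dummy column a SparseArray;
# dtype=np.uint8 keeps the fill value at integer 0.
wide = pd.get_dummies(
    df[["INFO_1", "INFO_2"]],
    prefix="", prefix_sep="",
    sparse=True, dtype=np.uint8,
).set_index(df["ARTID"])

# Convert the all-sparse frame into a scipy.sparse COO matrix.
coo = wide.sparse.to_coo()
print(coo.shape)
```

The row labels remain available as wide.index and the column labels as wide.columns, since the SciPy matrix itself carries no labels.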

If it is acceptable to have ARTID as the index, you can use:

pd.get_dummies(df[['INFO_1','INFO_2']],prefix='',prefix_sep='').set_index(df.ARTID)

             some_info_11  some_info_12  some_info_13  some_info_21  some_info_22  \
ARTID                                                                         
00001                 1             0             0             1             0   
00002                 0             1             0             0             1   
00003                 0             0             1             0             0   

          some_info_23  
ARTID                
00001                 0  
00002                 0  
00003                 1
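
If memory is the main concern, one alternative is to skip the wide DataFrame and build the sparse matrix directly from the long form. This is a sketch of that idea rather than part of the original answer. It assumes SciPy is available, and the names long_df, row_labels, col_labels and matrix are illustrative.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Sample data mirroring the question.
df = pd.DataFrame({
    "ARTID": ["00001", "00002", "00003"],
    "INFO_1": ["some_info_11", "some_info_12", "some_info_13"],
    "INFO_2": ["some_info_21", "some_info_22", "some_info_23"],
})

# Long form: one (ARTID, value) pair per non-missing cell.
# drop_duplicates keeps the result a 0/1 indicator even if the same
# value appears in both INFO columns for one ARTID.
long_df = (
    df.melt(id_vars="ARTID", value_name="value")
      .dropna(subset=["value"])
      .drop_duplicates(["ARTID", "value"])
)

# Map row and column labels to integer positions.
row_codes, row_labels = pd.factorize(long_df["ARTID"])
col_codes, col_labels = pd.factorize(long_df["value"], sort=True)

# Only the non-zero entries are ever stored.
matrix = sparse.csr_matrix(
    (np.ones(len(long_df), dtype=np.uint8), (row_codes, col_codes)),
    shape=(len(row_labels), len(col_labels)),
)
print(matrix.toarray())
```

Here row_labels[i] gives the ARTID for row i and col_labels[j] gives the value for column j, so the labels can be kept alongside the matrix without ever building the dense table.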
