
Unpivot pandas Dataframe with text data

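All the previous questions I could find about unpivoting a DataFrame deal with numeric data, so I still haven't found out how to do the following.

Suppose I have a DataFrame set up as follows: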

+--------+--------+--------+-------+
| Level1 | Level2 | Level3 | Props |
+--------+--------+--------+-------+
| A      | A      | C      | X,Y   |
+--------+--------+--------+-------+
| A      | B      | C      | Y,Z   |
+--------+--------+--------+-------+
| D      | E      | F      | Y,Z   |
+--------+--------+--------+-------+
| G      | H      | I      | X,Z   |
+--------+--------+--------+-------+

I would like to get:

+--------+--------+--------+---+---+---+
| Level1 | Level2 | Level3 | X | Y | Z |
+--------+--------+--------+---+---+---+
| A      | A      | C      | 1 | 1 | 0 |
+--------+--------+--------+---+---+---+
| A      | B      | C      | 0 | 1 | 1 |
+--------+--------+--------+---+---+---+
| D      | E      | F      | 0 | 1 | 1 |
+--------+--------+--------+---+---+---+
| G      | H      | I      | 1 | 0 | 1 |
+--------+--------+--------+---+---+---+

How can I do this?

Thanks!

R.

You can create the dummy columns with pd.Series.str.get_dummies and concatenate them back to the source DataFrame:

pd.concat((df.drop(columns="Props"), df.Props.str.get_dummies(",")), axis=1)


 Level1 Level2  Level3  X   Y   Z
0   A      A       C    1   1   0
1   A      B       C    0   1   1
2   D      E       F    0   1   1
3   G      H       I    1   0   1
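
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, and the names `dummies` and `result` are only illustrative, not part of the original answer.

```python
import pandas as pd

# Rebuild the question's example frame by hand (assumed to match the table in the question).
df = pd.DataFrame({
    "Level1": ["A", "A", "D", "G"],
    "Level2": ["A", "B", "E", "H"],
    "Level3": ["C", "C", "F", "I"],
    "Props": ["X,Y", "Y,Z", "Y,Z", "X,Z"],
})

# Split each Props string on "," and build one 0/1 indicator column per distinct value.
dummies = df["Props"].str.get_dummies(sep=",")

# Drop the original text column and attach the indicator columns side by side.
result = pd.concat([df.drop(columns="Props"), dummies], axis=1)
print(result)
```

Passing `columns="Props"` to `drop` instead of a positional axis keeps the call working on pandas 2.x, where the positional form is no longer accepted.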

As suggested by @BEN_YO, you can also use join:

df.join(df.pop("Props").str.get_dummies(","))
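
One thing to be aware of with the join version: `pop` removes the column from the frame it is called on, so `df` itself is modified. If the original frame is still needed afterwards, a possible sketch is to work on a copy. The name `work` and the hand-built data are assumptions for illustration.

```python
import pandas as pd

# Example frame rebuilt by hand from the question's table.
df = pd.DataFrame({
    "Level1": ["A", "A", "D", "G"],
    "Level2": ["A", "B", "E", "H"],
    "Level3": ["C", "C", "F", "I"],
    "Props": ["X,Y", "Y,Z", "Y,Z", "X,Z"],
})

# Work on a copy because pop() deletes "Props" from the frame it is called on.
work = df.copy()

# pop() returns the Props column and removes it from `work`;
# the remaining level columns are then joined with the indicator columns.
result = work.join(work.pop("Props").str.get_dummies(sep=","))

print(result)
print(list(df.columns))    # the original frame keeps its Props column
print(list(work.columns))  # the copy has lost it
```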

Try this:

import pandas as pd

# read the tab-separated file
df = pd.read_csv('test.csv', sep='\t')

# turn the props column into lists of values
df['props'] = df['props'].str.split(',')

# build the dummies and collapse them back to one row per original row
df1 = pd.get_dummies(df['props'].apply(pd.Series).stack()).groupby(level=0).sum()

# concatenate the dummies with the original df and drop 'props'
new_df = pd.concat([df.drop(columns='props'), df1], axis=1)
print(new_df)

Or, more compactly:

df['props'] = df['props'].str.split(',')
new_df = pd.concat([df.drop(columns='props'), pd.get_dummies(df['props'].apply(pd.Series).stack()).groupby(level=0).sum()], axis=1)
print(new_df)

Input

level1  level2  level3  props
A       A       C       X,Y
A       B       C       Y,Z
D       D       F       Y,Z
G       G       I       X,Z

Output

  level1 level2 level3  X  Y  Z
0      A      A      C  1  1  0
1      A      B      C  0  1  1
2      D      D      F  0  1  1
3      G      G      I  1  0  1

In [208]: df
Out[208]: 
  level1 level2 level3   props  dummy
0      A      A      C  [X, Y]      1
1      A      B      C  [Y, Z]      1
2      D      E      F  [Y, Z]      1
3      G      H      I  [X, Z]      1

In [209]: df = pd.DataFrame({'level1': list('AADG'), 'level2': list("ABEH"), 'level3': list("CCFI"), 'props':[list("XY"), list("YZ"), list("YZ"), list("XZ")] })

In [210]: df
Out[210]: 
  level1 level2 level3   props
0      A      A      C  [X, Y]
1      A      B      C  [Y, Z]
2      D      E      F  [Y, Z]
3      G      H      I  [X, Z]

In [211]: df['dummy'] = 1

In [212]: df[['level1', 'level2', 'level3']].join(df.explode('props').pivot(columns='props', values='dummy')).fillna(value=0)
Out[212]: 
  level1 level2 level3    X    Y    Z
0      A      A      C  1.0  1.0  0.0
1      A      B      C  0.0  1.0  1.0
2      D      E      F  0.0  1.0  1.0
3      G      H      I  1.0  0.0  1.0
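
The pivot above yields floats because the missing combinations start out as NaN before `fillna`. If integer 0/1 columns are wanted, one possible variation (a different technique from the session above, not the answerer's code) is to count the exploded values with `pd.crosstab`, which also avoids the helper `dummy` column. The names `exploded`, `counts` and `result` are illustrative.

```python
import pandas as pd

# Same frame as in the session above, with props already stored as lists.
df = pd.DataFrame({
    "level1": list("AADG"),
    "level2": list("ABEH"),
    "level3": list("CCFI"),
    "props": [list("XY"), list("YZ"), list("YZ"), list("XZ")],
})

# One row per (original row, prop) pair; the original row label stays in the index.
exploded = df["props"].explode()

# Count how often each prop occurs per original row; absent combinations become 0.
counts = pd.crosstab(exploded.index, exploded)

# Align the integer counts with the level columns on the row index.
result = df.drop(columns="props").join(counts)
print(result)
```

Because `crosstab` counts occurrences, a prop listed twice in the same row would show up as 2 rather than 1; for the data in the question every value is 0 or 1.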
