I have a data frame like the one below. The items in each row are unsorted, and the number of items per row varies.
+-------+
| Items |
+-------+
| X,Y,Z |
+-------+
| Z,X,Y |
+-------+
| Z,X |
+-------+
| Y |
+-------+
I want to split each entry on the comma (,) and feed the pieces into their respective columns: 1 if the value is present in the row and 0 if it is not.
My desired output is below:
+-------+---+---+---+
| Items | X | Y | Z |
+-------+---+---+---+
| X,Y,Z | 1 | 1 | 1 |
+-------+---+---+---+
| Z,X,Y | 1 | 1 | 1 |
+-------+---+---+---+
| Z,X | 1 | 0 | 1 |
+-------+---+---+---+
| Y | 0 | 1 | 0 |
+-------+---+---+---+
I know how to split the column with df['Items'].str.split(','), but feeding the pieces into their respective columns is the issue, because the items are unsorted. See the first two rows: they contain the same items in a different order.
Please guide me on how I should approach this.
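You can use pd.Series.str.get_dummies(), which splits each string on a separator and returns one 0/1 indicator column per distinct item, regardless of the order in which the items appear:
df = df.join(df.Items.str.get_dummies(','))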
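For reference, here is a minimal, self-contained sketch of the get_dummies approach. It rebuilds the sample frame from the question; the variable names dummies and result are only illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data from the question: unsorted items, variable count per row
df = pd.DataFrame({"Items": ["X,Y,Z", "Z,X,Y", "Z,X", "Y"]})

# Split each string on "," and build one 0/1 indicator column per distinct item
dummies = df["Items"].str.get_dummies(sep=",")

# Attach the indicator columns to the original frame (joined on the index)
result = df.join(dummies)
print(result)
```

Note that the split is literal: if the real data has spaces after the commas (for example "X, Y"), the items would come out as "X" and " Y", so it may be worth normalising the strings first, e.g. with df["Items"].str.replace(" ", "").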