How to create new dichotomized columns from values in an existing column using pandas

Question

I have a dataframe that looks like this:

ID       type       period
1        2          3
1        2          3
1        3          3
2        2          3
2        3          2
2        3          2
3        2          2

There are a total of X types and X periods. Not all types/periods will be used, but I need columns to be created for all X of each just so that the table doesn't break in the database when imported from pandas. (Assume X in this example is 3, but it's really 9, just shortened in this example.)

For each ID, I need a 1 to show that the type/period was present, and a 0 to show that it was not.

The desired dataframe looks like this:

ID   type_1   type_2   type_3   period_1   period_2   period_3
1    0        1        1        0          0          1
2    0        1        1        0          1          1
3    0        1        0        0          1          0

Any advice towards the right direction would be greatly appreciated! Thank you!

Answer 1

Starting from your DataFrame:

>>> import pandas as pd
>>> from io import StringIO

>>> df = pd.read_csv(StringIO("""
ID       type       period
1        2          3
1        2          3
1        3          3
2        2          3
2        3          2
2        3          2
3        2          2"""), sep=r'\s+')
>>> df
    ID  type    period
0   1   2       3
1   1   2       3
2   1   3       3
3   2   2       3
4   2   3       2
5   2   3       2
6   3   2       2

We can use groupby on the 'ID' and 'type' columns to get the size of each group, then unstack the result with fill_value=0 so that missing combinations become zeros, and finally convert to bool and then to int, since you want 0 and 1 values:

>>> df.groupby(['ID','type']).size().unstack(fill_value=0).astype(bool).astype(int)
type    2   3
ID      
1       1   1
2       1   1
3       1   0
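
Note that the result only has columns for the types that actually occur (2 and 3 here), while the question asks for a column for every possible type. A minimal sketch of one way to fill in the missing ones, assuming the types are the integers 1 to X (X = 3 as in the example; the names X and type_flags are only illustrative), is to reindex the columns and add a prefix:

```python
import pandas as pd

# Same rows as the dataframe in the question.
df = pd.DataFrame({
    "ID":     [1, 1, 1, 2, 2, 2, 3],
    "type":   [2, 2, 3, 2, 3, 3, 2],
    "period": [3, 3, 3, 3, 2, 2, 2],
})

X = 3  # assumed number of possible types (9 in the real data)

type_flags = (
    df.groupby(["ID", "type"]).size()
      .unstack(fill_value=0)
      .astype(bool).astype(int)
      # Add an all-zero column for every type in 1..X that never occurs.
      .reindex(columns=list(range(1, X + 1)), fill_value=0)
      # Rename the columns 1, 2, 3 to type_1, type_2, type_3.
      .add_prefix("type_")
)
print(type_flags)
```

Because reindex is given the full list of expected labels, the set of columns no longer depends on which types happen to be present in the data.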

And for the period column:

>>> df.groupby(['ID','period']).size().unstack(fill_value=0).astype(bool).astype(int)
period  2   3
ID      
1       0   1
2       1   1
3       1   0
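
To get the single table from the question, the two results still have to be padded to all X levels and joined side by side. The following is only a sketch under the assumption that types and periods are numbered 1 to X; the helper presence_flags and the variable X are hypothetical names introduced here, not part of pandas:

```python
import pandas as pd

# Same rows as the dataframe in the question.
df = pd.DataFrame({
    "ID":     [1, 1, 1, 2, 2, 2, 3],
    "type":   [2, 2, 3, 2, 3, 3, 2],
    "period": [3, 3, 3, 3, 2, 2, 2],
})

X = 3  # assumed number of levels per column (9 in the real data)


def presence_flags(frame, column, n_levels):
    """Return one 0/1 column per level 1..n_levels of `column`, indexed by ID."""
    flags = (
        frame.groupby(["ID", column]).size()
             .unstack(fill_value=0)
             .astype(bool).astype(int)
    )
    # Make sure every level has a column, even if it never occurs.
    flags = flags.reindex(columns=list(range(1, n_levels + 1)), fill_value=0)
    return flags.add_prefix(f"{column}_")


# Both pieces share the ID index, so they can be joined column-wise.
result = pd.concat(
    [presence_flags(df, "type", X), presence_flags(df, "period", X)],
    axis=1,
).reset_index()
print(result)
```

For the sample data this should reproduce the desired table from the question, with ID as a regular column followed by type_1 to type_3 and period_1 to period_3.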

Answer 1: accepted, 2021-07-30 08:07:03