How to unpivot pandas dataframe

Question

I have a Pandas dataframe which looks as follows: Starting Table:

Kode	Country	Procedure	male	male	female	female
Kode	Country	Procedure	two-year-old	three-year-old	two-year-old	three-year-old
1a	US	proc_1	4	6	3	6
1a	UK	proc_1	2	3	5	1
1b	US	proc_2	15	3	5	2
1b	UK	proc_2	8	4	7	3

CSV:

Code;Country;Procedure;male;male;female;female
Code;Country;Procedure;two-year-old;three-year-old;two-year-old;three-year-old
1a;US;proc_1;4;6;3;6
1a;UK;proc_1;2;3;5;1
1b;US;proc_2;15;3;5;2
1b;UK;proc_2;8;4;7;3

My python code:

df = pd.read_csv('test.csv', 
                 header=[0,1],delimiter= ';',
                 engine='python', 
                 encoding= 'unicode_escape')

In fact I have two row headers How to pivot pandas dataframe to get the following result?

Resulting Table:

Code	Country	Procedure	Gender	Age	Frequency
1a	US	proc_1	male	two-year-old	4
1a	US	proc_1	male	three-year-old	6
1a	US	proc_1	female	two-year-old	3
1a	US	proc_1	female	three-year-old	6
1a	UK	proc_1	male	two-year-old	2
1a	UK	proc_1	male	three-year-old	3
1a	UK	proc_1	female	two-year-old	5
1a	UK	proc_1	female	three-year-old	1
1b	...

Answer 1

You'll likely want to go with melt which is sort of the opposite of pivot:

import io
import pandas as pd

piv = pd.read_csv(io.StringIO("""Code;Country;Procedure;male;male;female;female
Code;Country;Procedure;two-year-old;three-year-old;two-year-old;three-year-old
1a;US;proc_1;4;6;3;6
1a;UK;proc_1;2;3;5;1
1b;US;proc_2;15;3;5;2
1b;UK;proc_2;8;4;7;3"""), header=[0, 1], delimiter=";")


unpivoted = piv.melt(id_vars=list(piv.columns[:3]),
                     var_name=['gender', 'age'],
                     value_name='frequency')

# cleanup the tupleized columns
unpivoted.columns = [(col[0] if isinstance(col, tuple) else col)
                     for col in unpivoted.columns]

Answer 2

Using "melt" with "pivot_table", you can try the following:

df_melt=df.melt(id_vars=['Kode','Country','Procedure'],var_name='Gender')

Extract two-year-old, three-year-old into its own column "Age_group", and remove non-numeric rows from the "value" column"

df_melt['Age_group']=df_melt['value'].str.extract('(.*-year-old)')
df_melt['Age_group']=df_melt['Age_group'].ffill()
df_melt=df_melt[pd.to_numeric(df_melt['value'],errors='coerce').notnull()]

If the original data frame is red into an editor as is, the columns named the same would be renamed as male, male.1, female, female.1, etc. To remove characters after "." from the "Gender" column

df_melt['Gender']=df_melt['Gender'].map(lambda x: x[0: x.find('.')] if '.' in x else x)

Pivot result

result=df_melt.pivot_table(index=['Kode','Country','Procedure','Gender','Age_group'],values='value')

How to unpivot pandas dataframe

Question

1 answers

solution1
2 2021-11-05 13:43:59

solution2
0 2021-11-05 14:46:43

How to unpivot pandas dataframe

Question

1 answers

solution1 2 2021-11-05 13:43:59

solution2 0 2021-11-05 14:46:43

solution1
2 2021-11-05 13:43:59

solution2
0 2021-11-05 14:46:43