Creating columns out of categorical variables with values from two other columns in pandas

Question

Original dataframe

I want it to transform into the following structure:

Area   |Ind3_2016|Ind6_2016|...|Ind12_2016|Ind3_2017|Ind6_2017|...| Ind12_2017
-------|---------|---------|---|----------|---------|---------|---|-----------
Alabama| 2306    | 2270    |...| 35621    | 2409    | 3391    |...| 36397

Create one column for every unique value in the IndCode column, once for 2016 and once for 2017, and fill these columns with the values from the 2016 and 2017 columns.

Answer 1

You can either perform two separate pivots and then concatenate the results, or do some stacking beforehand and just do one pivot.

Sample Data

import pandas as pd
df = pd.DataFrame({'Area': ['A', 'A','A','A','A'],
                   'IndCode': [3, 6, 10, 11, 12],
                   'Industry': ['blah', 'foo', 'bar', 'baz', 'boo'],
                   '2016': [2306, 2270, 5513, 7730, 35621],
                   '2017': [2409, 3391, 5438, 7890, 36397]  
})

Two pivots + Concat

# Pivot each year separately, building labels such as 'Ind3_2016' on the fly, then join side by side
pd.concat([pd.pivot_table(df, index='Area', columns='Ind'+df.IndCode.astype(str)+'_2016', values='2016'),
           pd.pivot_table(df, index='Area', columns='Ind'+df.IndCode.astype(str)+'_2017', values='2017')], axis=1)

Outputs:

IndCode  Ind10_2016  Ind11_2016  Ind12_2016  Ind3_2016  Ind6_2016  Ind10_2017  Ind11_2017  Ind12_2017  Ind3_2017  Ind6_2017
Area                                                                                                                       
A              5513        7730       35621       2306       2270        5438        7890       36397       2409       3391

Stacking before pivot

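# Stack the year columns into rows; reset_index() names the year column 'level_2' and the value column 0
df2 = df.set_index(['Area', 'IndCode'])[['2016', '2017']].stack().reset_index()
pd.pivot_table(df2, index='Area',
               columns='Ind'+df2.IndCode.astype(str)+'_'+df2.level_2.astype(str),
               values=0).reset_index()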

Outputs:

  Area  Ind10_2016  Ind10_2017  Ind11_2016  Ind11_2017  Ind12_2016  Ind12_2017  Ind3_2016  Ind3_2017  Ind6_2016  Ind6_2017
0    A        5513        5438        7730        7890       35621       36397       2306       2409       2270       3391
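
Ordering the columns as in the question

Both outputs above sort the column labels as strings, so Ind10 comes before Ind3. If the layout from the question is wanted (all 2016 columns in numeric IndCode order, then all 2017 columns), one possible sketch is to pivot on the integer IndCode first and build the labels afterwards. This is an alternative to the two approaches above, not part of the original answer. It uses DataFrame.pivot rather than pivot_table, so it assumes each (Area, IndCode) pair occurs only once; the name wide is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Area': ['A', 'A', 'A', 'A', 'A'],
                   'IndCode': [3, 6, 10, 11, 12],
                   'Industry': ['blah', 'foo', 'bar', 'baz', 'boo'],
                   '2016': [2306, 2270, 5513, 7730, 35621],
                   '2017': [2409, 3391, 5438, 7890, 36397]})

# Pivot both year columns at once. The result has MultiIndex columns of
# (year, IndCode) pairs, with IndCode still an integer, so it sorts numerically.
# Assumption: no duplicate (Area, IndCode) pairs, otherwise pivot raises.
wide = df.pivot(index='Area', columns='IndCode', values=['2016', '2017'])

# Flatten each (year, IndCode) pair into a single label such as 'Ind3_2016'.
wide.columns = [f'Ind{code}_{year}' for year, code in wide.columns]
wide = wide.reset_index()

print(wide.columns.tolist())
```

If duplicate (Area, IndCode) pairs are possible, pivot_table (which aggregates with the mean by default) is the safer choice, as in the approaches above.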
