
Split and transform columns into aggregated columns

The data I have is as described below:

Input:

df = pd.DataFrame({"col1": ["A1", "A2"], "col2": ["B1", "B2"], "2015-1": [231, 432], "2015-2": [456, 324]})
print(df)
  col1 col2  2015-1  2015-2
0   A1   B1     231     456
1   A2   B2     432     324

The columns 2015-1 and 2015-2 each correspond to a year and a month. I want to transform the data into the following:

Output:

print(df)
  col1 col2  year    month  values
0   A1   B1  2015        1     231
1   A1   B1  2015        2     456
2   A2   B2  2015        1     432
3   A2   B2  2015        2     324

I want to transform the data into another DataFrame without writing a loop: the data has many columns and rows, so looping takes a really long time. Is there a way to convert the input into the output without a loop?

You can melt the frame, split the variable column on '-', and join the result back:

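# Unpivot every column except col1 and col2 into variable/value pairs
s = df.melt(['col1', 'col2'])
# Split '2015-1' into two columns and attach them to the melted frame
s = s.join(s.variable.str.split('-', expand=True).rename(columns={0: 'Year', 1: 'Month'}))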
  col1 col2 variable  value  Year Month
0   A1   B1   2015-1    231  2015     1
1   A2   B2   2015-1    432  2015     1
2   A1   B1   2015-2    456  2015     2
3   A2   B2   2015-2    324  2015     2
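
The joined frame above still carries the original variable column, holds Year and Month as strings, and orders rows by period rather than by col1. Below is a minimal sketch of one way to reach the layout asked for in the question. The names `period`, `parts`, and `result` are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["A1", "A2"], "col2": ["B1", "B2"],
                   "2015-1": [231, 432], "2015-2": [456, 324]})

# Melt, naming the long-format columns up front
# ("period" and "values" are names chosen for this sketch).
s = df.melt(id_vars=["col1", "col2"], var_name="period", value_name="values")

# Split "2015-1" into two columns and convert both to integers.
parts = s["period"].str.split("-", expand=True).astype(int)
parts.columns = ["year", "month"]

result = (
    s.drop(columns="period")          # the combined label is no longer needed
     .join(parts)                     # aligns on the shared row index
     .sort_values(["col1", "col2", "year", "month"])
     .reset_index(drop=True)
     [["col1", "col2", "year", "month", "values"]]
)
print(result)
```

The sort step is only there to reproduce the row order shown in the question; it can be dropped if the order does not matter.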

Here's one approach using pd.melt and str.split:

output = df.melt(id_vars=['col1', 'col2'])
output.assign(**output.variable.str.split('-', expand=True)
                           .rename(columns={0:'year', 1:'month'}))

  col1 col2 variable  value  year month
0   A1   B1   2015-1    231  2015     1
1   A2   B2   2015-1    432  2015     1
2   A1   B1   2015-2    456  2015     2
3   A2   B2   2015-2    324  2015     2
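
Both answers split the melted variable column once per row. For a frame with many rows, one possible variation is to split each distinct column label once and attach the result with a join. This is a sketch under that assumption, not a benchmarked recommendation. The names `id_cols`, `period_cols`, `lookup`, and `long_df` are hypothetical, and the list comprehension iterates over column labels only, not over rows.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["A1", "A2"], "col2": ["B1", "B2"],
                   "2015-1": [231, 432], "2015-2": [456, 324]})

id_cols = ["col1", "col2"]
period_cols = [c for c in df.columns if c not in id_cols]

# One lookup row per distinct column label: "2015-1" -> year 2015, month 1.
lookup = pd.DataFrame(
    [label.split("-") for label in period_cols],
    index=period_cols,
    columns=["year", "month"],
).astype(int)

long_df = df.melt(id_vars=id_cols, var_name="period", value_name="values")

# Attach year/month by matching the "period" column against the lookup index.
long_df = long_df.join(lookup, on="period").drop(columns="period")
long_df = long_df[id_cols + ["year", "month", "values"]]
print(long_df)
```

The rows come out in melt order (all of 2015-1, then all of 2015-2); add a sort_values call if the grouped order from the question matters.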
