I have a dataframe in long format with a unique identifier in one column. My goal is to pivot it to wide format so that there is one row per user_id (student).
Current dataframe example:
   user_id test_type   test_date
0        1       ACT  2013-08-15
1        2       ACT  2011-12-09
2        3       SAT  2012-03-09
3        4       ACT  2003-07-27
4        4       SAT  2013-12-31
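For anyone who wants to reproduce this, the example frame above can be rebuilt roughly as follows. This is only a sketch: it assumes test_date is stored as plain strings, which the printed output does not confirm (the column could equally be datetime).

```python
import pandas as pd

# Sample data copied from the example above.
# Assumption: test_date is kept as strings; the real dtype is not shown in the post.
df = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 4],
    "test_type": ["ACT", "ACT", "SAT", "ACT", "SAT"],
    "test_date": ["2013-08-15", "2011-12-09", "2012-03-09",
                  "2003-07-27", "2013-12-31"],
})

print(df)
```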
The problem is that some students have taken both tests, so I ultimately want one column for ACT, one column for SAT, and a date column for each.
Desired Format:
   user_id test_ACT    ACT_date test_SAT    SAT_date
0        1      ACT  2013-08-15      NaN         NaN
1        2      ACT  2011-12-09      NaN         NaN
2        3      NaN         NaN      SAT  2012-03-09
3        4      ACT  2003-07-27      SAT  2013-12-31
I have tried to groupby and pivot:
# number each row within its user_id group: 0, 1, ...
df['idx'] = df.groupby('user_id').cumcount()
tmp = []
for var in ['test_type','test_date']:
    # build column labels such as test_type_0, test_date_1
    df['tmp_idx'] = var + '_' + df.idx.astype(str)
    tmp.append(df.pivot(index='user_id',columns='tmp_idx',values=var))
df_wide = pd.concat(tmp,axis=1).reset_index()
This means that the format is wide but not separated by test type.
Output from attempt but not desired:
   user_id test_type_0 test_date_0 test_type_1 test_date_1
0        1         ACT  2013-08-15         NaN         NaN
1        2         ACT  2011-12-09         NaN         NaN
2        3         SAT  2012-03-09         NaN         NaN
3        4         ACT  2003-07-27         SAT  2013-12-31
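The reason the SAT row for user 3 lands in the `_0` columns is that `cumcount()` numbers rows by their position within each user_id group, not by test type. A small sketch on the sample data (rebuilt here with test_date as strings, which is an assumption) shows the counter:

```python
import pandas as pd

# Sample data from the question (test_date as strings is an assumption).
df = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 4],
    "test_type": ["ACT", "ACT", "SAT", "ACT", "SAT"],
    "test_date": ["2013-08-15", "2011-12-09", "2012-03-09",
                  "2003-07-27", "2013-12-31"],
})

# cumcount() restarts at 0 for every user_id, regardless of test_type.
idx = df.groupby("user_id").cumcount()
print(idx.tolist())  # [0, 0, 0, 0, 1]
```

User 3's only test gets position 0 even though it is an SAT, so it shares a column with everyone else's ACT.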
After trying provided answer:
   index  user_id    ACT_date test_ACT  user_id    SAT_date test_SAT
0      0      1.0  2013-08-15      ACT      NaN         NaN      NaN
1      1      2.0  2011-12-09      ACT      NaN         NaN      NaN
2      2      NaN         NaN      NaN      3.0  2012-03-09      SAT
3      3      4.0  2003-07-27      ACT      NaN         NaN      NaN
4      4      NaN         NaN      NaN      4.0  2013-12-31      SAT
This should work:
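# ACT rows: index by user_id and keep only the date column
df1 = df[df.test_type == 'ACT'].set_index('user_id')[['test_date']].copy()
df1.columns = ['ACT_date']
df1["test_ACT"] = "ACT"
# SAT rows: same treatment
df2 = df[df.test_type == 'SAT'].set_index('user_id')[['test_date']].copy()
df2.columns = ['SAT_date']
df2["test_SAT"] = "SAT"
# align the two frames on user_id; sort_index keeps the rows in user_id order
finaldf = pd.concat([df1, df2], axis=1).sort_index().reset_index()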
# create a temporary column
# and set the index
res = (df.assign(temp=df.test_type)
         .set_index(['user_id', 'temp'])
      )
# unstack,
# remove the unnecessary column level,
# and rename the columns
(res.unstack()
    .droplevel(0, axis=1)
    .set_axis(['test_ACT', 'test_SAT', 'ACT_date', 'SAT_date'], axis=1)
)
        test_ACT test_SAT    ACT_date    SAT_date
user_id
1            ACT      NaN  2013-08-15         NaN
2            ACT      NaN  2011-12-09         NaN
3            NaN      SAT         NaN  2012-03-09
4            ACT      SAT  2003-07-27  2013-12-31
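A possible variation on the same idea, sketched with `DataFrame.pivot` instead of `set_index`/`unstack`. The helper column name `test` is made up for this example, the sample data is rebuilt with test_date as strings (an assumption), and `pivot` raises a ValueError if any user_id has the same test_type more than once, so this only fits data like the sample.

```python
import pandas as pd

# Sample data from the question (test_date as strings is an assumption).
df = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 4],
    "test_type": ["ACT", "ACT", "SAT", "ACT", "SAT"],
    "test_date": ["2013-08-15", "2011-12-09", "2012-03-09",
                  "2003-07-27", "2013-12-31"],
})

# Copy test_type into a helper column ("test", a hypothetical name)
# so it survives as a value when test_type becomes the column key.
wide = (
    df.assign(test=df["test_type"])
      .pivot(index="user_id", columns="test_type", values=["test", "test_date"])
)

# Flatten the (value, test_type) column MultiIndex into the requested names.
wide.columns = [
    f"test_{kind}" if value == "test" else f"{kind}_date"
    for value, kind in wide.columns
]

# Put each test next to its date and turn user_id back into a column.
wide = wide[["test_ACT", "ACT_date", "test_SAT", "SAT_date"]].reset_index()
print(wide)
```

Building the names from the column tuples avoids hard-coding the order that `unstack` or `pivot` happens to produce, which may matter if more test types show up later.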