I have an input dataframe like given below
df = pd.DataFrame({'person_id' :[1,1,1,2,2,2,2,2,2],'level_1': ['L1FR','L1Date','L1value','L1FR','L1Date','L1value','L2FR','L2Date','L2value'], 'val3':['Fasting','11/4/2005',1.33,'Random','18/1/2007',4.63,'Fasting','18/1/2017',8.63]})
It looks like as shown below
I would like to convert my output to look like as shown below
This is what I tried based on response from SO forum
g = df.level_1.str[-2:]
df.pivot_table(index='person_id', columns=g, values='val3',aggfunc='first')
This provides only the first record of each group (or person) like shown below which is very close to my expected output.Basically I am trying to do sort of transformation/rearrange the way input data is shown
but I want to have all the records. In real data, I have more than 100k records and each person can be repeated multiple times.
When I use g.groupby(g).cumcount()
as index, I get NA's for half of the records and I would like to avoid it. I don't know why this issue happens.
Any other way to approach this is also helpful. Can you help me get this output?
You're pretty close to have what you want.
The trick here is to add another index to your second level like this :
df = pd.DataFrame({'person_id' :[1,1,1,2,2,2,2,2,2],'level_1': ['L1FR','L1Date','L1value','L1FR','L1Date','L1value','L2FR','L2Date','L2value'], 'val3':['Fasting','11/4/2005',1.33,'Random','18/1/2007',4.63,'Fasting','18/1/2017',8.63]})
g = df.level_1.str[-2:]
# Extracting level's number
df['lvl'] = df.level_1.apply(lambda x: int(''.join(filter(str.isdigit, x))))
# Then you pivot with person_id and lvl
df = df.pivot_table(index=['person_id', 'lvl'], columns=g, values='val3', aggfunc='first')
Output should be :
level_1 FR te ue
person_id lvl
1 1 Fasting 11/4/2005 1.33
2 1 Random 18/1/2007 4.63
2 Fasting 18/1/2017 8.63
Then if you reset the level 1 index like this :
df.reset_index(level=1).drop("lvl", axis=1)
The output is :
level_1 FR te ue
person_id
1 Fasting 11/4/2005 1.33
2 Random 18/1/2007 4.63
2 Fasting 18/1/2017 8.63
And there you go !
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.