Given the following data frame, I would like to shift data: within each group defined by 'ID' and 'Series', the values in columns Q, R and T should be moved down to the row where 'Status' is End.
import pandas as pd

data = pd.DataFrame({
    'ID': ['A','A','A','B','B','B','B','C','C','C','C','C','D','D'],
    'Series': [1,1,1,1,1,2,2,1,1,2,2,2,1,1],
    'Status': ['Begin','Begin','End','Begin','End','Begin','End','Begin','End','Begin','Begin','End','Begin','End'],
    'Q': [9,'','',30,'',14,'',3,'',17,'','',1,''],
    'R': ['',8,'','','','','','','','',7,'','',''],
    'T': ['','',12,'',38,'',21,'',6,'','',35,'',5]
})
The result should be as follows:
result = pd.DataFrame({
    'ID': ['A','A','A','B','B','B','B','C','C','C','C','C','D','D'],
    'Series': [1,1,1,1,1,2,2,1,1,2,2,2,1,1],
    'Status': ['Begin','Begin','End','Begin','End','Begin','End','Begin','End','Begin','Begin','End','Begin','End'],
    'Q': ['','',9,'',30,'',14,'',3,'','',17,'',1],
    'R': ['','',8,'','','','','','','','',7,'',''],
    'T': ['','',12,'',38,'',21,'',6,'','',35,'',5]
})
Use GroupBy.transform with GroupBy.first to fill every row of a group with the group's first non-NaN value, then blank out all rows except the last one in each group with DataFrame.mask and DataFrame.duplicated:
import numpy as np

cols = ['Q', 'R', 'T']
# replace empty strings with NaN so that 'first' skips them
data[cols] = data[cols].astype(str).replace('', np.nan)
print(data)
   ID  Series Status    Q    R    T
0   A       1  Begin    9  NaN  NaN
1   A       1  Begin  NaN    8  NaN
2   A       1    End  NaN  NaN   12
3   B       1  Begin   30  NaN  NaN
4   B       1    End  NaN  NaN   38
5   B       2  Begin   14  NaN  NaN
6   B       2    End  NaN  NaN   21
7   C       1  Begin    3  NaN  NaN
8   C       1    End  NaN  NaN    6
9   C       2  Begin   17  NaN  NaN
10  C       2  Begin  NaN    7  NaN
11  C       2    End  NaN  NaN   35
12  D       1  Begin    1  NaN  NaN
13  D       1    End  NaN  NaN    5
g = data.groupby(['ID', 'Series'])
for c in cols:
    # broadcast the first non-NaN value of each group to all of its rows
    data[c] = g[c].transform('first')
print(data)
   ID  Series Status    Q    R    T
0   A       1  Begin    9    8   12
1   A       1  Begin    9    8   12
2   A       1    End    9    8   12
3   B       1  Begin   30  NaN   38
4   B       1    End   30  NaN   38
5   B       2  Begin   14  NaN   21
6   B       2    End   14  NaN   21
7   C       1  Begin    3  NaN    6
8   C       1    End    3  NaN    6
9   C       2  Begin   17    7   35
10  C       2  Begin   17    7   35
11  C       2    End   17    7   35
12  D       1  Begin    1  NaN    5
13  D       1    End    1  NaN    5
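The masking step relies on DataFrame.duplicated with keep='last', which flags every row of a group except the last one. The small sketch below shows that mask on a cut-down frame (the frame `df` and the names `not_last` and `not_end` are made up for illustration). It also compares it with a mask keyed on 'Status', which is an alternative not used in the answer; the two agree only when the End row is the last row of its group, as it is in the question's data.

```python
import pandas as pd

# Cut-down, hypothetical version of the question's data
df = pd.DataFrame({
    'ID': ['A', 'A', 'A', 'B', 'B'],
    'Series': [1, 1, 1, 1, 1],
    'Status': ['Begin', 'Begin', 'End', 'Begin', 'End'],
})

# True for every row that has a later row with the same (ID, Series),
# i.e. every row except the last one of each group
not_last = df.duplicated(['ID', 'Series'], keep='last')
print(not_last.tolist())  # [True, True, False, True, False]

# Alternative mask keyed on Status (an assumption, not part of the answer)
not_end = df['Status'].ne('End')
print((not_end == not_last).all())  # True for this data
```

If a group could contain rows after its End row, the Status-based mask would match the stated goal more directly than the position-based one.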
# keep the values only in the last row of each group and show missing values as empty strings
data[cols] = data[cols].mask(data.duplicated(['ID','Series'], keep='last'), '').fillna('')
print(data)
   ID  Series Status    Q    R    T
0   A       1  Begin
1   A       1  Begin
2   A       1    End    9    8   12
3   B       1  Begin
4   B       1    End   30        38
5   B       2  Begin
6   B       2    End   14        21
7   C       1  Begin
8   C       1    End    3         6
9   C       2  Begin
10  C       2  Begin
11  C       2    End   17    7   35
12  D       1  Begin
13  D       1    End    1         5
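For reference, the steps above can be combined into one self-contained sketch. The helper name `shift_to_last_row` and its parameters are made up for illustration, and the function works on a copy rather than modifying `data` in place. As in the answer, the values come back as strings because of astype(str), and the last row of each group is assumed to be the End row.

```python
import numpy as np
import pandas as pd


def shift_to_last_row(df, keys, cols):
    """Move the first non-missing value of each column in `cols`
    to the last row of each group defined by `keys` (hypothetical helper)."""
    out = df.copy()
    # Treat empty strings as missing so that 'first' skips them
    out[cols] = out[cols].astype(str).replace('', np.nan)
    # Broadcast the first non-missing value of each group to all of its rows
    for c in cols:
        out[c] = out.groupby(keys)[c].transform('first')
    # Blank every row except the last one of each group
    not_last = out.duplicated(keys, keep='last')
    out[cols] = out[cols].mask(not_last, '').fillna('')
    return out


data = pd.DataFrame({
    'ID': ['A','A','A','B','B','B','B','C','C','C','C','C','D','D'],
    'Series': [1,1,1,1,1,2,2,1,1,2,2,2,1,1],
    'Status': ['Begin','Begin','End','Begin','End','Begin','End',
               'Begin','End','Begin','Begin','End','Begin','End'],
    'Q': [9,'','',30,'',14,'',3,'',17,'','',1,''],
    'R': ['',8,'','','','','','','','',7,'','',''],
    'T': ['','',12,'',38,'',21,'',6,'','',35,'',5],
})

out = shift_to_last_row(data, ['ID', 'Series'], ['Q', 'R', 'T'])
print(out)
```

If numeric values are needed afterwards, one option is to convert the columns back with pd.to_numeric(..., errors='coerce'), which turns the empty strings into NaN.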