I have a dataframe that looks like this:
    bucket type   v
0        1    X  14
1        1    X  10
2        1    Y  11
3        1    X  15
4        2    X  16
5        2    Y   9
6        2    Y  10
7        3    Y  20
8        3    X  18
9        3    Y  15
10       3    X  14
The desired output looks like this:
    bucket type   v  v_paired
0        1    X  14       nan  (no Y coming before it)
1        1    X  10       nan  (no Y coming before it)
2        1    Y  11        14  (highest X in bucket 1 before this row)
3        1    X  15        11  (lowest Y in bucket 1 before this row)
4        2    X  16       nan  (no Y coming before it in the same bucket)
5        2    Y   9        16  (highest X in same bucket coming before)
6        2    Y  10        16  (highest X in same bucket coming before)
7        3    Y  20       nan  (no X coming before it in the same bucket)
8        3    X  18        20  (single Y coming before it in same bucket)
9        3    Y  15        18  (single X coming before it in same bucket)
10       3    X  14        15  (smallest Y coming before it in same bucket)
The goal is to construct the v_paired column, and the rules are as follows:
Look for rows in the same bucket that come before the current row and have the opposite type (X vs. Y); call these the 'pair candidates'.
If the current row is X, v_paired is the minimum v among the pair candidates; if the current row is Y, v_paired is the maximum v among them. If there are no pair candidates, v_paired is nan.
Thanks in advance.
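Read literally, the two rules above amount to a brute-force check per row. The following is only a sketch for reference, not an efficient solution: it is quadratic in the number of rows, the helper name `paired_value` is made up for illustration, and row 0 is placed in bucket 1 to match the desired output.

```python
import pandas as pd

# Example data from the question (row 0 assumed to be in bucket 1,
# as in the desired output).
df = pd.DataFrame({
    "bucket": [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3],
    "type":   list("XXYXXYYYXYX"),
    "v":      [14, 10, 11, 15, 16, 9, 10, 20, 18, 15, 14],
})

def paired_value(frame, pos):
    """Return v_paired for the row at position `pos` (hypothetical helper)."""
    row = frame.iloc[pos]
    earlier = frame.iloc[:pos]  # rows strictly before the current one
    # Pair candidates: same bucket, opposite type.
    candidates = earlier.loc[
        (earlier["bucket"] == row["bucket"]) & (earlier["type"] != row["type"]),
        "v",
    ]
    if candidates.empty:
        return float("nan")
    # X rows take the smallest candidate, Y rows the largest.
    return candidates.min() if row["type"] == "X" else candidates.max()

df["v_paired"] = [paired_value(df, pos) for pos in range(len(df))]
print(df)
```

Because it follows the rules directly, a version like this can serve as a reference to compare faster solutions against.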
I believe this has to be done sequentially. First, group by bucket:
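import pandas as pd

groups = df.groupby('bucket', group_keys=False)
This function will be applied to each bucket group:
def func(group):
    y_value = None  # smallest Y seen so far in this bucket
    x_value = None  # largest X seen so far in this bucket
    result = []
    for value_type, value in zip(group['type'], group['v']):
        if value_type == 'X':
            # an X row is paired with the smallest Y that came before it
            result.append(y_value)
            # explicit None check, so that a value of 0 is not dropped
            x_value = value if x_value is None else max(x_value, value)
        elif value_type == 'Y':
            # a Y row is paired with the largest X that came before it
            result.append(x_value)
            y_value = value if y_value is None else min(y_value, value)
    # keep the group's index so the values line up with df on assignment
    return pd.DataFrame({'v_paired': result}, index=group.index, dtype=float)
df['v_paired'] = groups.apply(func)['v_paired']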
Hopefully this will do the job.
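As an alternative to the explicit loop, the same running min/max idea can probably be expressed with grouped cumulative operations. This is a sketch under a few assumptions that are not stated in the question: `type` only ever contains X and Y, all `v` values are finite, and the rows are already in the order that defines "before". The example frame is rebuilt from the question, with row 0 placed in bucket 1 as in the desired output.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "bucket": [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3],
    "type":   list("XXYXXYYYXYX"),
    "v":      [14, 10, 11, 15, 16, 9, 10, 20, 18, 15, 14],
})

is_x = df["type"] == "X"

# Mask out the "wrong" type with a neutral element, so that it never
# wins the running min / max: +inf for min, -inf for max.
y_vals = df["v"].where(~is_x, np.inf)   # Y values, X rows become +inf
x_vals = df["v"].where(is_x, -np.inf)   # X values, Y rows become -inf

# Running min of Y and running max of X within each bucket.
min_y = y_vals.groupby(df["bucket"]).cummin()
max_x = x_vals.groupby(df["bucket"]).cummax()

# X rows take the running min of Y, Y rows the running max of X.
# A leftover +/-inf means there was no pair candidate, so it becomes NaN.
df["v_paired"] = min_y.where(is_x, max_x).replace([np.inf, -np.inf], np.nan)
print(df)
```

Since a row never counts as its own pair candidate (it has the same type as itself), including the current row in the cumulative operation does not change the result.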