
Pandas Dataframe iteration and selecting the rows based on condition - Change in Requirements

I have a sorted data frame, shown below (Input Dataframe). I need to iterate over its rows and select and retrieve rows into an output data frame based on the conditions below.

• Condition 1: For a given R1, R2, W, if we have two records, one with TYPE 'A' and one with TYPE 'B':
  a) If (amount1 & amount2) of TYPE 'A' is > (amount1 & amount2) of TYPE 'B', we need to bring the TYPE 'A' record into the output.
  b) If (amount1 & amount2) of TYPE 'B' is > (amount1 & amount2) of TYPE 'A', we need to bring the TYPE 'B' record into the output.
  c) If (amount1 & amount2) of TYPE 'A' is = (amount1 & amount2) of TYPE 'B', we need to bring the TYPE 'A' record into the output.

• Condition 2: For a given R1, R2, W, if we have only a record with TYPE 'A', we need to bring the TYPE 'A' record into the output.

• Condition 3: For a given R1, R2, W, if we have only a record with TYPE 'B', we need to bring the TYPE 'B' record into the output.

Input Dataframe

    R1  R2  W   TYPE    amount1 amount2
0   123 12  1   A   111 222
1   123 12  1   B   111 222
2   123 12  2   A   222 222
3   123 12  2   B   333 333
4   123 12  3   A   444 444
5   123 12  3   B   333 333
6   123 34  1   A   111 222
7   123 34  2   A   333 444
8   123 34  2   B   333 444
9   123 34  3   B   444 555
10  123 34  4   A   555 666
11  123 34  4   B   666 777

Output dataframe

    R1  R2  W   TYPE    amount1 amount2
0   123 12  1   A   111 222
3   123 12  2   B   333 333
4   123 12  3   A   444 444
6   123 34  1   A   111 222
7   123 34  2   A   333 444
9   123 34  3   B   444 555
11  123 34  4   B   666 777
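
For reference, here is a minimal sketch that reproduces the output above from the input frame. It assumes that "(amount1 & amount2) is greater" means comparing amount1 first and using amount2 as a tie-breaker; that is one interpretation, not something the conditions spell out. The names `keys` and `out` are illustrative only.

```python
import pandas as pd

# Input frame rebuilt from the table above
df = pd.DataFrame({
    "R1": [123] * 12,
    "R2": [12] * 6 + [34] * 6,
    "W": [1, 1, 2, 2, 3, 3, 1, 2, 2, 3, 4, 4],
    "TYPE": list("ABABABAABBAB"),
    "amount1": [111, 111, 222, 333, 444, 333, 111, 333, 333, 444, 555, 666],
    "amount2": [222, 222, 222, 333, 444, 333, 222, 444, 444, 555, 666, 777],
})

keys = ["R1", "R2", "W"]

out = (
    # Within each (R1, R2, W) group put the largest amounts first;
    # on equal amounts 'A' sorts before 'B' (condition 1c).
    # ASSUMPTION: amounts are compared as the pair (amount1, amount2).
    df.sort_values(
        keys + ["amount1", "amount2", "TYPE"],
        ascending=[True, True, True, False, False, True],
    )
    # Keep only the winning row of each group; groups with a single
    # record (conditions 2 and 3) pass through unchanged.
    .drop_duplicates(subset=keys, keep="first")
    # Restore the original row order
    .sort_index()
)
print(out)
```

Sorting followed by `drop_duplicates` avoids an explicit Python loop over the rows. If the amounts should be compared some other way (for example by their sum), only the sort columns need to change.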

Selection based on your criteria:

def my_selection(idf):
  # If both 'A' and 'B' appear in 'TYPE', keep only the row with 'A'
  if idf['TYPE'].nunique() == 2:
    return idf[idf['TYPE'] == 'A']
  else:
    return idf

df2 = df.groupby(['R1', 'R2', 'W'], as_index=False).apply(my_selection)
# Drop the original row labels so the group number becomes the index
df2.index = df2.index.droplevel(-1)

#     R1  R2  W TYPE  amount1  amount2
# 0  123  12  1    A      111      222
# 1  123  12  2    A      333      444
# 2  123  12  3    A      555      666
# 3  123  34  1    A      111      222
# 4  123  34  2    A      222      333
# 5  123  34  3    B      444      555
# 6  123  34  4    A      555      666

All you have to do is group by R1, R2, W and operate on the TYPE column as follows:

data.groupby(['R1','R2','W']).apply(lambda x: 'A' if 'A' in x['TYPE'].values else 'B').reset_index(name='TYPE')

You can then merge this output with the original DataFrame on the columns it contains (R1, R2, W and the chosen TYPE) to get the corresponding 'amount1' and 'amount2' values.
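
A minimal sketch of that merge step, using a few rows taken from the question's input. It follows this answer's rule (prefer 'A' whenever the group has one) and does not compare the amounts. The names `picked` and `result` are made up for illustration.

```python
import pandas as pd

# A few rows from the question's input: one group with both types,
# one with only 'A', one with only 'B'
data = pd.DataFrame({
    "R1": [123, 123, 123, 123],
    "R2": [12, 12, 34, 34],
    "W": [1, 1, 1, 3],
    "TYPE": ["A", "B", "A", "B"],
    "amount1": [111, 111, 111, 444],
    "amount2": [222, 222, 222, 555],
})

keys = ["R1", "R2", "W"]

# One row per group holding the chosen TYPE
picked = (
    data.groupby(keys)["TYPE"]
    .apply(lambda s: "A" if "A" in s.values else "B")
    .reset_index(name="TYPE")
)

# Merge back on the group keys plus TYPE to recover the amounts
result = picked.merge(data, on=keys + ["TYPE"], how="left")
print(result)
```

A left merge keeps exactly one row per group as long as each (R1, R2, W, TYPE) combination is unique in the original frame.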

This is what I would do:

categories = ['B', 'A']  # categories in ascending order of precedence
d = {i: e for e, i in enumerate(categories)}  # rank lookup: {'B': 0, 'A': 1}
s = df['TYPE'].map(d)  # helper series holding each row's rank

Then substitute this series for TYPE, take the group-wise maximum with groupby + transform('max'), and keep the rows where the helper series equals that maximum:

out = df[s.eq(df.assign(TYPE=s).groupby(['R1','R2','W'])['TYPE'].transform('max'))]
print(out)

     R1  R2  W TYPE  amount1  amount2
0   123  12  1    A      111      222
2   123  12  2    A      333      444
4   123  12  3    A      555      666
6   123  34  1    A      111      222
7   123  34  2    A      222      333
9   123  34  3    B      444      555
10  123  34  4    A      555      666
