Group by a column, then compare two other columns and return a value in a different column

Question

I have a dataframe similar to this:

import numpy as np
import pandas as pd

data = {'COMB': ["PNR1", "PNR1", "PNR11", "PNR2", "PNR2"],
        'FROM': ["MAA", "BLR", "DEL", "TRV", "HYD"],
        'TO': ["BLR", "MAA", "MAA", "HYD", "TRV"]}
md = pd.DataFrame(data)
md

What I want to do is create another column based on a condition: if the FROM of one row is equal to the TO of the next row, the new column should contain "R"; otherwise it should contain "O". My final output should look like this.

Can anyone help me do this in Python? I tried the following method, but it gives me an error:

md_merged=(md>>
            group_by('COMB')>>
            mutate(TYPE=np.where(md['FROM'].isin(md['TO']),"R","O"))>>
           ungroup)

ValueError: Length of values does not match length of index

Please help.

Answer 1

This solution compares all values within each group, not only the previous and next rows.

You can use a custom lambda function in GroupBy.apply to build a boolean mask. To avoid a MultiIndex in the result, pass group_keys=False to DataFrame.groupby, then set the new values with numpy.where:

mask = md.groupby('COMB', group_keys=False).apply(lambda x: x['FROM'].isin(x['TO']))
md = md.assign(Type=np.where(mask,"R","O"))
print (md)
    COMB FROM   TO Type
0   PNR1  MAA  BLR    R
1   PNR1  BLR  MAA    R
2  PNR11  DEL  MAA    O
3   PNR2  TRV  HYD    R
4   PNR2  HYD  TRV    R

This solution compares the previous and next rows within each group:

Another idea is to use DataFrameGroupBy.shift, which should be faster than groupby.apply:

mask = (md.groupby('COMB')['FROM'].shift().eq(md['TO']) | 
        md.groupby('COMB')['TO'].shift(-1).eq(md['FROM']))

md = md.assign(Type=np.where(mask,"R","O"))
print (md)
    COMB FROM   TO Type
0   PNR1  MAA  BLR    R
1   PNR1  BLR  MAA    R
2  PNR11  DEL  MAA    O
3   PNR2  TRV  HYD    R
4   PNR2  HYD  TRV    R
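
To see where the two masks above part ways, here is a small sketch on a made-up three-leg booking. The PNR3 rows and both helper function names are my own assumptions, not part of the question. The within-group check is written as an explicit loop instead of GroupBy.apply, only so the sketch does not depend on how apply shapes its result.

```python
import pandas as pd

# Hypothetical booking: the return leg is not adjacent to the outbound leg.
legs = pd.DataFrame({
    "COMB": ["PNR3", "PNR3", "PNR3"],
    "FROM": ["MAA", "DEL", "BLR"],
    "TO":   ["BLR", "HYD", "MAA"],
})

def within_group_mask(df):
    """True where FROM appears anywhere among the TO values of the same COMB."""
    mask = pd.Series(False, index=df.index)
    for _, grp in df.groupby("COMB"):
        mask.loc[grp.index] = grp["FROM"].isin(grp["TO"]).to_numpy()
    return mask

def adjacent_mask(df):
    """True where the previous row's FROM equals this TO,
    or the next row's TO equals this FROM (within the same COMB)."""
    g = df.groupby("COMB")
    return g["FROM"].shift().eq(df["TO"]) | g["TO"].shift(-1).eq(df["FROM"])

within = within_group_mask(legs).tolist()
adjacent = adjacent_mask(legs).tolist()
print(within)    # [True, False, True]
print(adjacent)  # [False, False, False]
```

On the question's sample data both masks agree, so the choice only matters when a group can hold more than two rows.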

Answer 2

Play with numpy. Take md into numpy, sort the columns other than COMB within each row, and find all duplicated rows. Then label the duplicates conditionally.

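s = md.to_numpy(copy=True)  # copy, so the sort below does not write back into md
s[:, 1:3] = np.sort(s[:, 1:3])  # sort FROM/TO within each row
md['Type'] = np.where(pd.DataFrame(s).duplicated(keep=False), 'R', 'O')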
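
For anyone who wants to try the sorted-pair idea in isolation, here is a self-contained sketch of it. The `pair` and `key` names are mine, and building a separate key frame is one possible way to leave md untouched, not necessarily how the answer intended it.

```python
import numpy as np
import pandas as pd

md = pd.DataFrame({
    "COMB": ["PNR1", "PNR1", "PNR11", "PNR2", "PNR2"],
    "FROM": ["MAA", "BLR", "DEL", "TRV", "HYD"],
    "TO":   ["BLR", "MAA", "MAA", "HYD", "TRV"],
})

# Sort each (FROM, TO) pair so that MAA->BLR and BLR->MAA produce the same key.
pair = np.sort(md[["FROM", "TO"]].to_numpy(copy=True), axis=1)
key = pd.DataFrame({"COMB": md["COMB"], "A": pair[:, 0], "B": pair[:, 1]})

# Rows whose (COMB, unordered pair) occurs more than once are marked "R".
md["Type"] = np.where(key.duplicated(keep=False), "R", "O")
print(md)
```

One caveat with this approach: two identical legs in the same COMB (same FROM and same TO) also count as duplicates, so they would be labelled "R" even though neither is a return.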

Alternatively, compare consecutive values and use np.where to set the Type. The code below worked for me.

md['Type'] =np.where(md.groupby('COMB',as_index=False).apply(lambda x: (x['FROM']==x['TO'].shift())|(x['FROM'].shift(-1)==x['TO'])),'R','O')



   COMB FROM   TO Type
0   PNR1  MAA  BLR    R
1   PNR1  BLR  MAA    R
2  PNR11  DEL  MAA    O
3   PNR2  TRV  HYD    R
4   PNR2  HYD  TRV    R
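
As far as I can tell, the np.where call above assigns by position, so it relies on the rows already being ordered by COMB. Below is a sketch of the same comparison written with grouped shift, which keeps the original index and so should not care about row order. The shuffled sample rows are my own rearrangement of the question's data.

```python
import numpy as np
import pandas as pd

# Same five legs as in the question, deliberately out of COMB order.
md = pd.DataFrame({
    "COMB": ["PNR2", "PNR1", "PNR11", "PNR1", "PNR2"],
    "FROM": ["TRV", "MAA", "DEL", "BLR", "HYD"],
    "TO":   ["HYD", "BLR", "MAA", "MAA", "TRV"],
})

g = md.groupby("COMB")
# FROM equals the previous TO in the group, or the next FROM equals this TO.
mask = md["FROM"].eq(g["TO"].shift()) | g["FROM"].shift(-1).eq(md["TO"])
md["Type"] = np.where(mask, "R", "O")
print(md)
```

Note that this condition also flags connecting legs (A to B followed by B to C), which is not the same test as checking whether a leg is reversed by its neighbour.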

Answer 1
1 2021-11-12 06:36:01

Answer 2
0 2021-11-12 06:41:28
