
How to replace values of a column when values of all columns except two match in pandas?

I have a dataframe that looks like this:

       iv_1  iv_2  iv_3  iv_4  iv_5  col2rplc  idenifier
0       0      0     0     0     0      a          1
333     0      0     0     0     0      b          0
      ......
222     1      2     3     4     5      aa         1
324     1      2     3     4     5      cc         0
      ......
1234    1      0     0     0     1      a          1
1235    0      2     0     4     0      a          1
1236    0      0     3     0     0      a          1
1237    0      0     1     0     0      b          0
1238    0      2     0     2     0      b          0
1239    3      0     0     0     3      b          0

These are two pandas dataframes concatenated. The identifier column (named idenifier in the frame) tells which set a particular row comes from, set_1 or set_0. For every set_0 row whose iv_* columns all match those of a set_1 row, I would like to replace its col2rplc value with the col2rplc value of that set_1 row. So, in the above example, for the first two rows I would like b to be replaced with a, and I would like cc to be replaced with aa, while all the remaining rows of column col2rplc, which have no matching row, stay intact.

How do I do this?

Use duplicated to identify duplicate rows, then mask and ffill:

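# sort so that, within each group of identical iv_* values,
# the set_1 row (idenifier == 1) comes before the set_0 row
df = df.sort_values(['iv_1','iv_2','iv_3','iv_4','iv_5', 'idenifier'],
                    ascending=False)

# True for every row whose iv_* values repeat an earlier row
mask = df.duplicated(df.columns[:5])
# blank out the duplicates, then fill them from the preceding row
df['col2rplc'] = df['col2rplc'].mask(mask).ffill()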

Output (notice you have an extra duplicate in the last few rows that you didn't mention in your question):

      iv_1  iv_2  iv_3  iv_4  iv_5 col2rplc  idenifier
0        0     0     0     0     0        a          1
222      1     2     3     4     5       aa          1
324      1     2     3     4     5       aa          0
333      0     0     0     0     0        a          0
1234     1     0     0     0     1        a          1
1235     0     2     0     2     0        a          1
1236     0     0     3     0     0        a          1
1237     0     0     1     0     0        b          0
1238     0     2     0     2     0        a          0
1239     3     0     0     0     3        b          0
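
For comparison, here is a minimal sketch of a different route that does not rely on sort order: look up the set_1 value for each row with a left merge on the iv_* columns. This is not the method used in the answer above. The sample frame is only a subset of the rows from the question, and the names key_cols, set1 and matched are made up for illustration.

```python
import pandas as pd

# Subset of the question's rows (illustrative data, not the full frame).
df = pd.DataFrame(
    {
        "iv_1": [0, 0, 1, 1, 1, 0],
        "iv_2": [0, 0, 2, 2, 0, 0],
        "iv_3": [0, 0, 3, 3, 0, 1],
        "iv_4": [0, 0, 4, 4, 0, 0],
        "iv_5": [0, 0, 5, 5, 1, 0],
        "col2rplc": ["a", "b", "aa", "cc", "a", "b"],
        "idenifier": [1, 0, 1, 0, 1, 0],
    },
    index=[0, 333, 222, 324, 1234, 1237],
)

key_cols = ["iv_1", "iv_2", "iv_3", "iv_4", "iv_5"]

# One col2rplc value per key combination, taken from set_1 rows only.
set1 = (
    df.loc[df["idenifier"] == 1, key_cols + ["col2rplc"]]
    .drop_duplicates(key_cols)
)

# Left merge keeps the row order of df; keys without a set_1 row give NaN.
matched = df[key_cols].merge(set1, on=key_cols, how="left")["col2rplc"].to_numpy()

# Overwrite only where a set_1 match exists; other rows keep their own value.
df["col2rplc"] = pd.Series(matched, index=df.index).fillna(df["col2rplc"])

print(df)
```

Because the lookup is built from set_1 rows only, a set_0 row is never overwritten by another set_0 row, and the original row order and index are left as they were.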
