
Setting Values in Pandas Dataframe Based on Condition in Another Column

I am looking to update the values in a pandas Series that satisfy a certain condition, taking the corresponding value from another column.

Specifically, I want to look at the Subcluster column and, if the value equals 1, update the record to the corresponding value in the Cluster column.

For example:

Cluster  Subcluster
3        1
3        2
3        1
3        4
4        1
4        2

Should result in this:

Cluster  Subcluster
3        3
3        2
3        3
3        4
4        4
4        2

I've been trying to use apply and a lambda function, but can't seem to get it to work properly. Any advice would be greatly appreciated. Thanks!

You can use np.where:

import numpy as np

df['Subcluster'] = np.where(df['Subcluster'].eq(1), df['Cluster'], df['Subcluster'])

Output:

    Cluster  Subcluster
0         3           3
1         3           2
2         3           3
3         3           4
4         4           4
5         4           2
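
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample in the question, with the column names Cluster and Subcluster assumed to match the real data.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (column names assumed to match the real frame).
df = pd.DataFrame({
    "Cluster": [3, 3, 3, 3, 4, 4],
    "Subcluster": [1, 2, 1, 4, 1, 2],
})

# Where Subcluster equals 1, take the value from Cluster; otherwise keep Subcluster.
df["Subcluster"] = np.where(df["Subcluster"].eq(1), df["Cluster"], df["Subcluster"])

print(df)
```

np.where evaluates the condition for the whole column at once, so no Python-level loop over rows is involved.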

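In your case, try mask. Assigning the result back to the column is safer than calling mask with inplace=True on df.Subcluster, which may not update df when pandas copy-on-write behaviour is active:

df['Subcluster'] = df['Subcluster'].mask(df['Subcluster'].eq(1), df['Cluster'])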
df
Out[12]: 
   Cluster  Subcluster
0        3           3
1        3           2
2        3           3
3        3           4
4        4           4
5        4           2
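
Series.mask replaces values where the condition is True, while Series.where replaces values where the condition is False, so the same update can be written either way. A small sketch on the question's sample data (the variable names is_one, masked and kept are only illustrative):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Cluster": [3, 3, 3, 3, 4, 4],
    "Subcluster": [1, 2, 1, 4, 1, 2],
})

is_one = df["Subcluster"].eq(1)

# mask: replace the entries where the condition is True.
masked = df["Subcluster"].mask(is_one, df["Cluster"])

# where: keep the entries where the condition is True, so the condition is negated.
kept = df["Subcluster"].where(~is_one, df["Cluster"])

print(masked.tolist())
print(kept.tolist())
```

Both calls return a new Series and leave df untouched until the result is assigned back to the column.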

Or:

df.loc[df.Subcluster==1,'Subcluster'] = df['Cluster']
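
This works even though the right-hand side is the full Cluster column, because pandas aligns the assigned Series to the selected rows by index label. A sketch to illustrate, using hypothetical string labels and a deliberately reversed right-hand side (neither is in the original question):

```python
import pandas as pd

# Sample data from the question, with hypothetical string labels as the index.
df = pd.DataFrame(
    {"Cluster": [3, 3, 3, 3, 4, 4], "Subcluster": [1, 2, 1, 4, 1, 2]},
    index=list("abcdef"),
)

# The right-hand side is reversed on purpose: values are matched by label, not position.
reversed_cluster = df["Cluster"].iloc[::-1]

df.loc[df["Subcluster"] == 1, "Subcluster"] = reversed_cluster

print(df)
```

Only rows a, c and e are selected by the condition, and each receives the Cluster value carrying the same label.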

Really all you need here is to use .loc with a mask (you don't actually need to create the mask separately; you could apply it inline).

import numpy as np
import pandas as pd

df = pd.DataFrame({'cluster': np.random.randint(0, 10, 10),
                   'subcluster': np.random.randint(0, 3, 10)})
df.to_clipboard(sep=',')

df at this point:

,cluster,subcluster
0,8,0
1,5,2
2,6,2
3,6,1
4,8,0
5,1,1
6,0,0
7,6,0
8,1,0
9,3,1

Create and apply the mask (you could do this all in one line):

mask = df.subcluster == 1
df.loc[mask,'subcluster'] = df.loc[mask,'cluster']
df.to_clipboard(sep=',')

Final output:

,cluster,subcluster
0,8,0
1,5,2
2,6,2
3,6,6
4,8,0
5,1,1
6,0,0
7,6,0
8,1,0
9,3,3

Here's the lambda you couldn't write. With axis=1, apply passes each row to the lambda as a Series, so x is the current row and x['Cluster'] is that row's value in the Cluster column.

df['Subcluster'] = df.apply(lambda x: x['Cluster'] if x['Subcluster'] == 1 else x['Subcluster'], axis = 1)

And the output:

    Cluster Subcluster
0   3       3
1   3       2
2   3       3
3   3       4
4   4       4
5   4       2
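
As a quick check, the row-wise apply can be compared against a vectorized np.where on the question's sample data. This is only a sketch; row_wise, vectorized and same are illustrative names.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Cluster": [3, 3, 3, 3, 4, 4],
    "Subcluster": [1, 2, 1, 4, 1, 2],
})

# Row-wise: the lambda runs once per row, receiving the row as a Series.
row_wise = df.apply(
    lambda row: row["Cluster"] if row["Subcluster"] == 1 else row["Subcluster"],
    axis=1,
)

# Vectorized: the condition is evaluated over the whole column at once.
vectorized = np.where(df["Subcluster"].eq(1), df["Cluster"], df["Subcluster"])

same = row_wise.tolist() == vectorized.tolist()
print(same)
```

Both give the same values here. Since apply with axis=1 calls a Python function for every row, the vectorized forms are usually the better choice on large frames.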
