Conditional replacing of column values in a dataframe?
I have a dataframe and I want to replace some values in one of its columns based on a condition. My dataframe looks like this:
ID customer_name arrival_month leaving_month
1524 ABC 201508 201605
1185 XYZ 201701 201801
8456 IJK 201801 201902
I am trying a simple operation here. I want to replace the values in the leaving_month column with currentmonth = 201802 wherever leaving_month > 201802. I tried .loc, and it gives the error below.
df.loc[(df['leaving_month'] > 201802)] = 201802
KeyError: 'leaving_month'
I have also tried np.where, which also gives an error.
df['leaving_month']=np.where(df['leaving_month']>currentmonth, currentmonth)
KeyError: 'leaving_month'
I have also tried brute-force looping:
for o in range(len(df)):
    if df.loc[o, 'leaving_month'] > currentmonth:
        df.loc[o, 'leaving_month'] = currentmonth
IndexingError: Too many indexers
Can someone please point me in the right direction, figure out what I am doing wrong, or suggest a better solution? This is quite a simple problem, but somehow I am not getting through.
You are replacing an entire row. Instead, set a specific column with .loc. See the second indexer in the solution below.
df.loc[df['leaving_month'] > 201802, 'leaving_month'] = 201802
df
returns
ID customer_name arrival_month leaving_month
0 1524 ABC 201508 201605
1 1185 XYZ 201701 201801
2 8456 IJK 201801 201802
You can read about DataFrame indexing in the pandas docs.
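For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the .loc fix. It also shows two alternatives that are not part of the answer above and are offered only as options: np.where with all three arguments (the attempt in the question passed only two), and Series.clip with an upper bound. The variable names by_loc, by_where and by_clip are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1524, 1185, 8456],
    "customer_name": ["ABC", "XYZ", "IJK"],
    "arrival_month": [201508, 201701, 201801],
    "leaving_month": [201605, 201801, 201902],
})
currentmonth = 201802

# Option 1 (the answer above): a boolean mask selects the rows,
# and the second indexer restricts the assignment to one column.
by_loc = df.copy()
by_loc.loc[by_loc["leaving_month"] > currentmonth, "leaving_month"] = currentmonth

# Option 2 (alternative): np.where takes condition, value-if-true, value-if-false.
by_where = df.copy()
by_where["leaving_month"] = np.where(
    by_where["leaving_month"] > currentmonth,
    currentmonth,
    by_where["leaving_month"],
)

# Option 3 (alternative): capping values at an upper bound.
by_clip = df.copy()
by_clip["leaving_month"] = by_clip["leaving_month"].clip(upper=currentmonth)

print(by_loc["leaving_month"].tolist())  # [201605, 201801, 201802]
```

All three variants should leave the other columns untouched. Each one works on a copy so that the original df stays available for comparison.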