Replace values in a Pandas DataFrame based on a condition

Question

I have a dataframe column with some numeric values. I want these values to be replaced by 1 and 0 based on a given condition: if the value is above the mean of the column, change it to 1, otherwise set it to 0.

Here is the code I have now:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

dataset = pd.read_csv('data.csv')
dataset = dataset.dropna(axis=0, how='any')

X = dataset.drop(['myCol'], axis=1)
y = dataset.iloc[:, 4:5].values

mean_y = np.mean(dataset.myCol)

The target is the dataframe y. y looks like this:

and so on. mean_y is equal to 3.55. Therefore, I need all values greater than 3.55 to become 1, and the rest to become 0.

I applied this loop, but without success:

for i in dataset.myCol:
    if dataset.myCol[i] > mean_y:
        dataset.myCol[i] = 1
    else:
        dataset.myCol[i] = 0

The output is the following:

What am I doing wrong? Can someone please explain the mistake to me?

Thank you!

Answer 1

Try this vectorized approach:

dataset.myCol = np.where(dataset.myCol > dataset.myCol.mean(), 1, 0)
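
As a self-contained illustration of the line above, here is a minimal sketch. The four sample values are made up and stand in for the contents of data.csv, which is not shown in the question; the mean is stored in mean_y before the column is overwritten.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for data.csv (not from the question)
dataset = pd.DataFrame({"myCol": [5.0, 3.0, 2.0, 4.0]})

# Compute the mean once, before the column is overwritten
mean_y = dataset["myCol"].mean()

# np.where(condition, value_if_true, value_if_false) is applied element-wise
dataset["myCol"] = np.where(dataset["myCol"] > mean_y, 1, 0)

print(dataset["myCol"].tolist())
```

Because the comparison is strict, a value exactly equal to the mean would become 0 here.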

Answer 2

Convert the boolean mask to integers, so that True becomes 1 and False becomes 0:

print (dataset.myCol > mean_y)
0     True
1    False
2    False
3    False
Name: myCol, dtype: bool

dataset.myCol = (dataset.myCol > mean_y).astype(int)
print (dataset)
   myCol
0      1
1      0
2      0
3      0
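
If the original values are still needed later, the same mask can be written to a separate column instead of overwriting myCol. This is only a sketch: the sample values and the column name above_mean are assumptions chosen for illustration, not something from the question.

```python
import pandas as pd

# Hypothetical sample data; these values are an assumption, not the asker's file
dataset = pd.DataFrame({"myCol": [5.0, 3.0, 3.2, 3.0]})

mean_y = dataset["myCol"].mean()

# Comparing a Series with a scalar gives a boolean Series (the mask)
mask = dataset["myCol"] > mean_y

# astype(int) maps True to 1 and False to 0; "above_mean" is a hypothetical name
dataset["above_mean"] = mask.astype(int)

print(dataset)
```

Keeping the flag in its own column leaves myCol untouched, so the mean can be recomputed or the threshold changed without reloading the data.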

Your loop approach is not recommended because it is slow. If you still want to loop, use iterrows and set each value with loc, using the index label and the column name:

for i, x in dataset.iterrows():
    if dataset.loc[i, 'myCol'] > mean_y:
        dataset.loc[i, 'myCol'] = 1
    else:
        dataset.loc[i, 'myCol'] = 0
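
A likely reason the loop in the question misbehaves is that iterating over a Series yields its values, not its index labels, so dataset.myCol[i] ends up using a column value as a row label. The sketch below shows the difference on made-up sample data (an assumption, since the real file is not shown) and then loops over the labels explicitly.

```python
import pandas as pd

# Hypothetical sample data; these values are an assumption, not the asker's file
dataset = pd.DataFrame({"myCol": [5.0, 3.0, 3.2, 3.0]})

# Iterating over a Series gives the stored values ...
values_seen = [v for v in dataset["myCol"]]
# ... while the row labels live in the index
labels = list(dataset.index)
print(values_seen)
print(labels)

# A label-based loop: slow on large frames, but it addresses the intended rows
mean_y = dataset["myCol"].mean()
for label in dataset.index:
    dataset.loc[label, "myCol"] = 1 if dataset.loc[label, "myCol"] > mean_y else 0

print(dataset)
```

Note that the column keeps its float dtype in this version, so the results are stored as 1.0 and 0.0; the vectorized answers above produce integers.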


Answer 1
5 2018-04-16 12:34:33

Answer 2
2 2018-04-16 12:34:47
