Change column values in pandas based on a condition

Question

df:

I am trying to create a new column, quantile, based on which quantile each value falls in. For example:

if value <= 1st quartile : quantile = 1
if 1st quartile < value <= 2nd quartile : quantile = 2
if 2nd quartile < value <= 3rd quartile : quantile = 3
if 3rd quartile < value <= 4th quartile : quantile = 4

Code:

f_q = df['A'].quantile(0.25)
s_q = df['A'].quantile(0.5)
t_q = df['A'].quantile(0.75)
fo_q = df['A'].quantile(1)


# Assumes df has the default integer index 0..len(df)-1.
for index in range(len(df)):
    value = df.at[index, "A"]

    # Replace each value with the number of the quartile it falls in.
    if 0 < value <= f_q:
        df.at[index, "A"] = 1
    elif f_q < value <= s_q:
        df.at[index, "A"] = 2
    elif s_q < value <= t_q:
        df.at[index, "A"] = 3
    elif t_q < value <= fo_q:
        df.at[index, "A"] = 4

The code works fine, but I would like to know if there is a more efficient pandas way of doing this. Any suggestions are helpful.

Answer 1

Yes, using pd.qcut:

>>> pd.qcut(df.A, 4).cat.codes + 1
0    1
1    3
2    2
3    4
4    1
5    4
6    4
7    3
8    2
9    1
dtype: int8

(This gives exactly the same result as your code.)
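Since the question does not show df, here is a minimal self-contained sketch of the same idea. The data is hypothetical (chosen only for illustration), and the result is written to a new column named quantile, as the question describes, rather than overwriting A. Passing labels=False makes pd.qcut return integer bin indices directly, which is an alternative to .cat.codes.

```python
import pandas as pd

# Hypothetical data; the original df is not shown in the question.
df = pd.DataFrame({"A": [5, 40, 20, 90, 1, 80, 95, 45, 30, 10]})

# qcut splits the values into 4 equal-frequency bins (quartiles).
# labels=False returns integer bin indices 0..3 instead of interval labels,
# so adding 1 gives quartile numbers 1..4.
df["quantile"] = pd.qcut(df["A"], 4, labels=False) + 1

print(df["quantile"].tolist())  # [1, 3, 2, 4, 1, 4, 4, 3, 2, 1]
```

Unlike the loop in the question, this does not special-case values less than or equal to 0: qcut places the minimum value in the first bin.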

You could also call np.unique on the qcut result:

>>> np.unique(pd.qcut(df.A, 4), return_inverse=True)[1] + 1
array([1, 3, 2, 4, 1, 4, 4, 3, 2, 1])

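Or, using pd.factorize. Note the slight difference in the output: factorize numbers the bins in order of first appearance rather than in sorted order, so the labels no longer correspond to quartile rank: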

>>> pd.factorize(pd.qcut(df.A, 4))[0] + 1
array([1, 2, 3, 4, 1, 4, 4, 2, 3, 1])
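To make the difference between the two labelings concrete, the sketch below compares .cat.codes with pd.factorize on the same bins. The data is hypothetical, since the original df is not shown.

```python
import pandas as pd

# Hypothetical data; the original df is not shown in the question.
s = pd.Series([5, 40, 20, 90, 1, 80, 95, 45, 30, 10])
bins = pd.qcut(s, 4)

# cat.codes follows the sorted order of the bins,
# so code + 1 is the quartile rank of each value.
by_rank = (bins.cat.codes + 1).tolist()

# factorize assigns codes in order of first appearance instead,
# so the second row (40, third quartile) is labeled 2.
by_appearance = (pd.factorize(bins)[0] + 1).tolist()

print(by_rank)        # [1, 3, 2, 4, 1, 4, 4, 3, 2, 1]
print(by_appearance)  # [1, 2, 3, 4, 1, 4, 4, 2, 3, 1]
```

If the labels need to mean "which quartile", the rank-based version is the one to use; the factorize version only guarantees that equal bins get equal labels.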

Answer 1
2 accepted 2018-11-10 03:19:07
