argsort() positive and negative values separately and add a new pandas column

Question

I have a dataframe with a column, 'col', that contains both positive and negative numbers. I would like to rank the positive and the negative numbers separately, with 0 excluded so that it does not distort the ranking. My issue is that the code below also updates the 'col' column. I must be keeping a reference to it somewhere, but I'm not sure where.

import random

import numpy as np
import pandas as pd

data = {'col': [random.randint(-1000, 1000) for _ in range(100)]}
df = pd.DataFrame(data)

pos_idx = np.where(df.col > 0)[0]
neg_idx = np.where(df.col < 0)[0]
p = df[df.col > 0].col.values
n = df[df.col < 0].col.values
# Percentile-style rank (0 to 100) within each sign group
p_rank = np.round(p.argsort().argsort() / (len(p) - 1) * 100, 1)
n_rank = np.round((n * -1).argsort().argsort() / (len(n) - 1) * 100, 1)
pc = df.col.values
pc[pc > 0] = p_rank
pc[pc < 0] = n_rank
df['ranking'] = pc

Answer 1

I was able to figure it out on my own.

I created a new column of zeros, then used .loc to update the values at their respective index locations.

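# pos_rank / neg_rank: the rank arrays for the positive and negative values
# (presumably p_rank and n_rank from the question)
df['ranking'] = 0.0  # float zeros, so fractional ranks fit without a dtype change
df.loc[df.col > 0, 'ranking'] = pos_rank
df.loc[df.col < 0, 'ranking'] = neg_rank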
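For reference, here is a minimal, self-contained sketch of the .loc approach on a small made-up column. The helper name percentile_rank and the sample values are assumptions for illustration, not part of the original post; the ranking formula is the one from the question.

```python
import numpy as np
import pandas as pd


def percentile_rank(values):
    """Scale sorted positions to 0..100 (hypothetical helper, ties broken by argsort order)."""
    values = np.asarray(values)
    if len(values) < 2:
        # Guard against dividing by zero when a group has fewer than two values.
        return np.zeros(len(values))
    return np.round(values.argsort().argsort() / (len(values) - 1) * 100, 1)


# Small illustrative column; the question uses 100 random integers instead.
df = pd.DataFrame({'col': [5, -3, 0, 12, -8, 7, -1]})

pos_mask = df['col'] > 0
neg_mask = df['col'] < 0

# Start from a float column of zeros, then fill each sign group through .loc.
df['ranking'] = 0.0
df.loc[pos_mask, 'ranking'] = percentile_rank(df.loc[pos_mask, 'col'].to_numpy())
# Negate the negatives so that the most negative value ranks highest.
df.loc[neg_mask, 'ranking'] = percentile_rank(-df.loc[neg_mask, 'col'].to_numpy())

print(df)
```

Because the ranks are written into 'ranking' only, 'col' is never touched. Rows where col is 0 keep a ranking of 0.0 here, which is indistinguishable from the lowest rank in each group; initializing the column with np.nan instead may be preferable if that distinction matters.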

Answer 2

One way to do it is to avoid mutating the original dataframe by replacing this line in your code:

pc = df.col.values

with:

pc = df.copy().col.values

So that:

print(df)
# Output
    col  ranking
0  -492       49
1   884       93
2  -355       36
3   741       77
4  -210       24
..  ...      ...
95  564       57
96  683       63
97 -129       18
98 -413       44
99  810       81

[100 rows x 2 columns]
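The underlying reason is that .values can return an array that shares memory with the dataframe, so in-place writes to pc end up in 'col' as well (on pandas versions with copy-on-write enabled, that array is read-only instead). A lighter alternative to copying the whole dataframe, sketched here under the assumption that only this one column is needed, is to copy just the array with to_numpy(copy=True). Converting to float at the same time also avoids a second pitfall: writing fractional ranks into an integer array truncates them, which would explain the whole-number ranks in the output above. The sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Small illustrative column; the question uses 100 random integers instead.
df = pd.DataFrame({'col': [5, -3, 0, 12, -8, 7, -1]})

# Independent float copy: writes to pc cannot reach df['col'],
# and fractional ranks are stored without truncation.
pc = df['col'].to_numpy(dtype=float, copy=True)

# Build both masks before overwriting anything in pc.
pos = pc > 0
neg = pc < 0

# Same ranking formula as in the question, applied per sign group.
# Like the question's code, this assumes at least two values per group.
pc[pos] = np.round(pc[pos].argsort().argsort() / (pos.sum() - 1) * 100, 1)
pc[neg] = np.round((-pc[neg]).argsort().argsort() / (neg.sum() - 1) * 100, 1)

df['ranking'] = pc
print(df)
```

Computing the masks up front is a small design choice: it keeps the second assignment independent of whatever values the first one wrote into pc.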

Answer 1 0 2023-01-28 13:23:38
Answer 2 0 2023-01-29 08:03:29
