argsort() positive and negative values separately and add the result as a new pandas column
I have a dataframe with a column, 'col', that contains both positive and negative numbers. I would like to rank the positive and the negative numbers separately, with 0 excluded so that it does not distort the ranking.
My issue is that the code below overwrites the 'col' column. I must be keeping a reference to it, but I am not sure where.
import random
import numpy as np
import pandas as pd
data = {'col':[random.randint(-1000, 1000) for _ in range(100)]}
df = pd.DataFrame(data)
pos_idx = np.where(df.col > 0)[0]
neg_idx = np.where(df.col < 0)[0]
p = df[df.col > 0].col.values
n = df[df.col < 0].col.values
p_rank = np.round(p.argsort().argsort()/(len(p)-1)*100,1)
n_rank = np.round((n*-1).argsort().argsort()/(len(n)-1)*100,1)
pc = df.col.values
pc[pc > 0] = p_rank
pc[pc < 0] = n_rank
df['ranking'] = pc
I was able to figure it out on my own.
I created a new column of zeros, then used .loc to update the values at their respective index locations.
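# pos_rank and neg_rank are the rank arrays for the positive and negative values (p_rank and n_rank in the question)
df['ranking'] = 0.0  # float zeros, so the fractional ranks fit without a dtype change
df.loc[df.col > 0, 'ranking'] = pos_rank
df.loc[df.col < 0, 'ranking'] = neg_rank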
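For reference, here is a minimal end-to-end sketch of the .loc approach described above. The fixed seed, the `pct_rank` helper, and the `pos_mask`/`neg_mask` names are my own additions for illustration and do not appear in the original post; the ranking formula is the double argsort from the question.

```python
import numpy as np
import pandas as pd

# Fixed seed so the example is reproducible (an assumption; the question uses random.randint).
rng = np.random.default_rng(0)
df = pd.DataFrame({'col': rng.integers(-1000, 1001, size=100)})
original = df['col'].copy()  # kept only to check that 'col' is left untouched

pos_mask = df['col'] > 0
neg_mask = df['col'] < 0


def pct_rank(values):
    """Hypothetical helper: double-argsort rank scaled to 0..100, as in the question."""
    if len(values) < 2:
        # Avoid dividing by zero when there are fewer than two values to rank.
        return np.zeros(len(values))
    return np.round(values.argsort().argsort() / (len(values) - 1) * 100, 1)


# Boolean indexing returns new objects, so nothing here writes into df['col'].
pos_rank = pct_rank(df.loc[pos_mask, 'col'].to_numpy())
neg_rank = pct_rank(-df.loc[neg_mask, 'col'].to_numpy())

# Start from float zeros (rows where col == 0 keep 0.0), then fill in by mask.
df['ranking'] = 0.0
df.loc[pos_mask, 'ranking'] = pos_rank
df.loc[neg_mask, 'ranking'] = neg_rank

print(df.head())
```

Because the ranks are written into a separate column through a boolean mask, 'col' is never assigned to, which is what avoids the overwrite described in the question.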
One way to do it is to avoid mutating the original dataframe, by replacing this line in your code:
pc = df.col.values
with:
pc = df.copy().col.values
So that:
print(df)
# Output
    col  ranking
0  -492       49
1   884       93
2  -355       36
3   741       77
4  -210       24
..  ...      ...
95  564       57
96  683       63
97 -129       18
98 -413       44
99  810       81

[100 rows x 2 columns]
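To illustrate why the copy matters, here is a small sketch. It assumes, as the question's behaviour suggests, that `.values` can return an array backed by the same memory as the column; whether it does depends on the pandas version and the column dtype. The five-row frame and the variable names are made up for the example, and `to_numpy(copy=True)` is used as an alternative way to request an independent array.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col': [-492, 884, 0, 741, -210]})

# May be backed by the same memory as the column (version and dtype dependent).
maybe_view = df['col'].values

# Always a separate array, so writing to it cannot touch the dataframe.
independent = df['col'].to_numpy(copy=True)

print(np.shares_memory(maybe_view, df['col'].values))
print(np.shares_memory(independent, df['col'].values))

# In-place edits on the independent array leave 'col' as it was.
independent[independent > 0] = 1
print(df['col'].tolist())
```

Checking with `np.shares_memory` before writing into an array taken from a dataframe is a quick way to find where a reference like the one in the question is being kept.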