Treating NaN as 0 when adding and NaN when dividing
I have a sparse dataframe on which I need to do some column operations involving addition and weighted averaging.
df
A B C D E F
0 NaN NaN NaN 30 15 25
1 15 25 35 NaN NaN NaN
2 NaN NaN NaN 35 10 15
3 10 20 35 NaN NaN NaN
Now I have to create three new columns: one is (A + D), one is ((A * B) + (D * E)) / (A + D), and the last one is ((A * B * C) + (D * E * F)) / ((A * B) + (D * E)).
The problem I am having is that when I treat NaN as 0, I get errors when I divide because you can't divide by 0, but when I perform the operations with NaN, my result is NaN. I tried just writing a conditional, but I get an error because it's a column, so that doesn't work. I didn't want to do it a row at a time because speed matters. Any help would be greatly appreciated.
I read in your data and used .fillna(0) to replace the NaN values. Is that an option?
import pandas as pd
df = pd.read_clipboard().fillna(0)
df
A B C D E F
0 0.0 0.0 0.0 30.0 15.0 25.0
1 15.0 25.0 35.0 0.0 0.0 0.0
2 0.0 0.0 0.0 35.0 10.0 15.0
3 10.0 20.0 35.0 0.0 0.0 0.0
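If copying the table to the clipboard is not convenient, the same frame can be built by hand. This is only a sketch: the name raw is made up for illustration, and the values are copied from the table in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the question's sample frame directly instead of reading the clipboard.
# `raw` is a hypothetical name; the numbers come from the question's table.
raw = pd.DataFrame({
    "A": [np.nan, 15, np.nan, 10],
    "B": [np.nan, 25, np.nan, 20],
    "C": [np.nan, 35, np.nan, 35],
    "D": [30, np.nan, 35, np.nan],
    "E": [15, np.nan, 10, np.nan],
    "F": [25, np.nan, 15, np.nan],
})

# Same step as above: treat every missing value as 0.
df = raw.fillna(0)
print(df)
```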
Then I simply entered your formulas:
# G: plain sum, with NaN already replaced by 0
df['G'] = (df['A'] + df['D'])
# H: weighted average of B and E, weighted by A and D
df['H'] = ((df['A'] * df['B']) + (df['D'] * df['E'])) / (df['A'] + df['D'])
# I: weighted average of C and F, weighted by A*B and D*E
df['I'] = ((df['A'] * df['B'] * df['C']) + (df['D'] * df['E'] * df['F'])) / ((df['A'] * df['B']) + (df['D'] * df['E']))
df
Is this what you are expecting below?
A B C D E F G H I
0 0.0 0.0 0.0 30.0 15.0 25.0 30.0 15.0 25.0
1 15.0 25.0 35.0 0.0 0.0 0.0 15.0 25.0 35.0
2 0.0 0.0 0.0 35.0 10.0 15.0 35.0 10.0 15.0
3 10.0 20.0 35.0 0.0 0.0 0.0 10.0 20.0 35.0
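Finally, if any inf values show up (this happens when a nonzero value is divided by 0), you can change them to NaN with df = df.replace([np.inf, -np.inf], np.nan), after import numpy as np. Note that replace returns a new dataframe, so assign the result back.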
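As an alternative to cleaning up inf afterwards, the zero denominators can be masked before dividing, which gives "NaN as 0 when adding, NaN when dividing" in one vectorized pass. The following is a minimal sketch, not the answer's own method: safe_div, ab and de are hypothetical names introduced here, and the data is the sample from the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with NaN treated as 0 for the additions.
df = pd.DataFrame({
    "A": [np.nan, 15, np.nan, 10],
    "B": [np.nan, 25, np.nan, 20],
    "C": [np.nan, 35, np.nan, 35],
    "D": [30, np.nan, 35, np.nan],
    "E": [15, np.nan, 10, np.nan],
    "F": [25, np.nan, 15, np.nan],
}).fillna(0)


def safe_div(num, den):
    """Hypothetical helper: divide two Series, giving NaN where den is 0."""
    # Series.where keeps values where the condition holds and puts NaN elsewhere.
    return num / den.where(den != 0)


# Reused partial products (names are illustrative).
ab = df["A"] * df["B"]
de = df["D"] * df["E"]

df["G"] = df["A"] + df["D"]
df["H"] = safe_div(ab + de, df["G"])
df["I"] = safe_div(ab * df["C"] + de * df["F"], ab + de)
print(df)
```

With this sample, no denominator is 0, so H and I contain ordinary numbers. A row where both A and D are missing would get NaN in H and I rather than inf or an error.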