How to replace NaN values in columns with calculated CAGR values
I have a dataframe with NaN values, and I want to replace those NaN values with values projected from each column's CAGR.
   val1  val2  val3  val4  val5
0   100   100   100   100   100
1    90   110    80   110    50
2    70   150    70   NaN   NaN
3   NaN   NaN   NaN   NaN   NaN
CAGR (compound annual growth rate) = (end value / first value) ** (1 / number of years) - 1
For example, val1's CAGR is -23%, so the last value of val1 will be 53.9.
Column val4's CAGR is 10%,
so the NaN in row 2 becomes 121 and the NaN in row 3 becomes 133.
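To make the val4 arithmetic concrete, here is a minimal sketch. The helper name cagr is made up for illustration; it returns the rate as a fraction by subtracting 1 from the annual growth factor, and it assumes one row per year.

```python
def cagr(first_value, end_value, n_years):
    """Return the compound annual growth rate as a fraction (0.10 means 10%)."""
    # float() keeps the division exact under Python 2 as well
    return (float(end_value) / first_value) ** (1.0 / n_years) - 1

# val4 has two known values (100, then 110), so one year has elapsed.
rate = cagr(100, 110, 1)

# Each missing row is the previous row grown by (1 + rate).
row2 = 110 * (1 + rate)
row3 = row2 * (1 + rate)

print(round(rate, 4), round(row2, 1), round(row3, 1))
```

This reproduces the 10% rate and the 121 and 133 (133.1 before rounding) quoted for val4.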
How can I replace the NaN values automatically?
My questions are:
1) How can I calculate the CAGR for each column?
I used isnull() to find which rows are empty, but I don't know how to exclude those rows when calculating the CAGR.
2) How can I replace the NaN values with the calculated values?
Thank you.
from __future__ import division, print_function  # for Python 2.7
import numpy as np

# whitespace-delimited data (tabs or spaces)
a = '''100 100 100 100 100
90 110 80 110 50
70 150 70 NaN NaN
NaN NaN NaN NaN NaN
'''

# parse into a 2-D float array; float('NaN') yields nan
data = np.array([[float(x) for x in line.split()] for line in a.splitlines()])

for col in range(data.shape[1]):
    # row index of the last non-NaN value (assumes the column has at least one NaN)
    Nyears = np.isnan(data[:, col]).argmax() - 1
    endvalue = data[Nyears, col]
    # annual growth factor, i.e. 1 + CAGR
    cagr = (endvalue / data[0, col]) ** (1 / Nyears)
    print(Nyears, endvalue, cagr)
    # fill the NaN rows top to bottom, compounding from the previous row
    for year in np.argwhere(np.isnan(data[:, col])).ravel():
        data[year, col] = data[year - 1, col] * cagr

print(data)
I get:
[[ 100. 100. 100. 100. 100. ]
[ 90. 110. 80. 110. 50. ]
[ 70. 150. 70. 121. 25. ]
[ 58.56620186 183.71173071 58.56620186 133.1 12.5 ]]
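Since the question starts from a dataframe, the same idea can be sketched directly in pandas. This is only one possible approach, not a definitive implementation. The function name fill_with_cagr is made up for illustration, and the sketch assumes that the NaN values sit only at the bottom of each column and that each row is one year.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "val1": [100, 90, 70, np.nan],
    "val2": [100, 110, 150, np.nan],
    "val3": [100, 80, 70, np.nan],
    "val4": [100, 110, np.nan, np.nan],
    "val5": [100, 50, np.nan, np.nan],
})

def fill_with_cagr(col):
    """Extend a column past its last known value at its own compound growth rate.

    Hypothetical helper: assumes all NaN values are at the end of the column.
    """
    known = col.dropna()          # drops the NaN rows, which answers question 1
    n_years = len(known) - 1
    if n_years < 1 or len(known) == len(col):
        return col                # nothing to fill, or too few points for a rate
    factor = (known.iloc[-1] / known.iloc[0]) ** (1.0 / n_years)  # 1 + CAGR
    filled = col.copy()
    # the k-th missing row after the last known value gets last * factor**k
    for k, idx in enumerate(col.index[len(known):], start=1):
        filled.loc[idx] = known.iloc[-1] * factor ** k
    return filled

result = df.apply(fill_with_cagr)  # apply runs the helper once per column
print(result.round(2))
```

The filled values agree with the NumPy output above. Note that val1's last row comes out as about 58.57, a rate of roughly -16.3% per year, rather than the 53.9 quoted in the question.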