Avoiding for loops in a Python DataFrame

Question

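Question 1.

Suppose I have n years of annual returns r and my initial wealth is 100. Every year I have a fixed expense of 6. I want to compute my wealth for each year. I can do this in a for loop, but for my purposes that is too slow. How can I do it with a DataFrame?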

import pandas as pd

# r is a pd.Series holding the n annual returns
wealth = pd.Series(index=range(n + 1), dtype=float)
wealth.iloc[0] = 100
for i in range(n):
    wealth.iloc[i + 1] = wealth.iloc[i] * (1 + r.iloc[i]) - 6

At first I thought

wealth = ((1 + r - 0.06).cumprod()).multiply(other = 100)

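would be the solution, but it is not. The expense is not 6% of wealth. It is a fixed amount, currently 6.

Question 2.

I want to do the above N times. In each case, I generate r by sampling n returns with replacement.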

r = returnY.sample(n,replace=True).reset_index(drop=True)

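Then, for that set of returns, I build the wealth path described above and collect the results into an n*N DataFrame of wealth paths. I can do this in a for loop, but for large N and n it takes a long time to run. Is there an efficient and elegant way to do this?

Question 3.

Suppose allWealth is a DataFrame of all the wealth paths. I want to find, for each row, the percentage of columns that are less than 0. This is how I solved it.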

yy = allWealth.copy()
yy[yy>0] = 1
yy[yy<=0] = 0
yy.sum(axis = 1)/N

Is there a better, more elegant solution?

Answer 1

Question 1: It looks like you want to apply the "reduce" pattern. You can use the reduce function from functools.

import numpy as np
from functools import reduce
rs = np.random.random(50)*0.3   #sequence of annual returns
result = reduce(lambda w,r: w*(1+r)-6, rs, 100)

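If you want to keep all the intermediate values, use itertools.accumulate() instead. For example, replace the last line with the following:

import itertools
# the initial keyword requires Python 3.8 or later
ts_iter = itertools.accumulate(rs, lambda w, r: w*(1+r) - 6, initial=100)
ts = list(ts_iter)     # itertools.accumulate returns an iterator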
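
Both reduce and accumulate still run a Python-level loop over the years. As an alternative that is not part of the answer above, the recurrence w[t+1] = w[t]*(1+r[t]) - 6 can be unrolled into cumulative products and sums: with C[t] the cumulative growth factor up to year t, w[t] = C[t] * (100 - 6 * sum(1/C[k] for k = 1..t)). The following is a minimal sketch under that assumption; the function names wealth_path and wealth_path_loop are made up for illustration, and the closed form breaks down if any return equals -100% (division by zero).

```python
import numpy as np
import pandas as pd

def wealth_path(r, w0=100.0, expense=6.0):
    """Vectorized wealth path for w[t+1] = w[t] * (1 + r[t]) - expense."""
    growth = np.cumprod(1.0 + np.asarray(r, dtype=float))  # C[1], ..., C[n]
    # Each expense is discounted back to year 0 and accumulated.
    discounted_expenses = np.cumsum(expense / growth)
    path = growth * (w0 - discounted_expenses)
    # Prepend the initial wealth so the result has n + 1 entries.
    return pd.Series(np.concatenate(([w0], path)))

def wealth_path_loop(r, w0=100.0, expense=6.0):
    """Reference implementation using the explicit loop from the question."""
    out = [w0]
    for ret in r:
        out.append(out[-1] * (1.0 + ret) - expense)
    return np.array(out)

example_returns = [0.1, -0.05, 0.2]
print(wealth_path(example_returns))
print(wealth_path_loop(example_returns))
```

The two functions should agree up to floating-point rounding, so the loop version is a convenient check before relying on the vectorized one.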

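Question 2: You can first generate an n x N random matrix by sampling with replacement. Then you can apply the function to each column with np.apply_along_axis.

import numpy as np
from functools import reduce
rm = np.random.random((n, N))   # stand-in for the resampled returns, one column per path
def sim(rs):
    return reduce(lambda w, r: w * (1 + r) - 6, rs, 100)
result = np.apply_along_axis(sim, 0, rm)   # final wealth of each of the N paths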
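
np.apply_along_axis still calls the Python function once per path. A possible alternative, sketched here as an assumption rather than as part of the answer, is to loop over the n years only and update all N paths at once with array arithmetic. The function name simulate_paths and its parameters are hypothetical; the returns argument plays the role of returnY from the question.

```python
import numpy as np
import pandas as pd

def simulate_paths(returns, n, N, w0=100.0, expense=6.0, seed=None):
    """Return an (n + 1) x N DataFrame: one row per year, one column per path."""
    rng = np.random.default_rng(seed)
    # Bootstrap: draw n * N returns with replacement from the observed returns.
    sampled = rng.choice(np.asarray(returns, dtype=float), size=(n, N), replace=True)
    wealth = np.empty((n + 1, N))
    wealth[0] = w0
    # One iteration per year; each step updates all N paths in a single operation.
    for t in range(n):
        wealth[t + 1] = wealth[t] * (1.0 + sampled[t]) - expense
    return pd.DataFrame(wealth)

all_wealth = simulate_paths([0.05, -0.02, 0.10, 0.01], n=30, N=1000, seed=42)
print(all_wealth.shape)
```

The loop now runs n times instead of n * N times, which is usually the dominant saving when N is large.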

Question 3: You do not need to assign 1 and 0 to the original DataFrame. A boolean mask DataFrame of True and False values implicitly acts as ones and zeros when summed.

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.random((50,30)))
mask = df < 0.5
mask.sum(axis=1)/30
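
Applied to the question's data, the same idea can be written in one line, since the mean of a boolean mask is the fraction of True values. The small allWealth frame below is invented purely for illustration. Note that the snippet in the question, as written, yields the share of strictly positive values, whereas this sketch counts values below zero, as the question's text describes.

```python
import pandas as pd

# Made-up example: 3 years (rows) by 4 simulated paths (columns).
allWealth = pd.DataFrame({
    0: [100.0, 50.0, -5.0],
    1: [100.0, -10.0, -30.0],
    2: [100.0, 20.0, 4.0],
    3: [100.0, 5.0, -1.0],
})

# Fraction of paths with negative wealth in each year.
share_below_zero = (allWealth < 0).mean(axis=1)
print(share_below_zero)
```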

Answer 2

I used @chi's solution with a few small edits.

import numpy as np
import itertools

rm = np.random.random((n,N))   # annual returns; here each row is one path
rm0 = np.insert(rm, 0, 100, axis=1)   # prepend the initial wealth to every row

def wealth(rs):
    return list(itertools.accumulate(rs, lambda w,r: w*(1+r)-6))

result = np.apply_along_axis(wealth, 1, rm0)

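itertools.accumulate does not accept an initial value before Python 3.8 (the initial keyword was added in that release), so the initial wealth is inserted at the front of the returns array.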


Solution 1
1 2021-03-30 03:24:08

Solution 2
0 2021-04-09 20:46:09
