
Converting for() loops to apply() functions in R for a custom XIRR function

I've been looking for functions to calculate the XIRR of a large database of cashflows/balances in R, and I came across this function, which I've been trying to adapt to my code:

library(tidyverse)

xirr2 <- function(exflow, date) {
    if(as.numeric(max(date) - min(date)) <= 2) {
        return(0)
    } else if(abs(sum(exflow, na.rm = TRUE)) < 1e-12) {
        return(0)
    } else {
        npv <- function(range, exflow, date){
            for(test.rate in range) {
                temp <- as.data.frame(cbind(exflow, date)) %>%
                    mutate(npv = exflow * ((1 + test.rate/100)^(as.numeric(max(date) - date)/365))) %>%
                    select(npv) %>%
                    .[1]
                if(sum(exflow, na.rm = TRUE) > 0) {
                    if(sum(temp, na.rm = TRUE) > 0) {
                        min.rate <- test.rate
                        next
                    } else {
                        max.rate <- test.rate
                        break
                    }
                } else {
                    if(sum(temp, na.rm = TRUE) < 0) {
                        min.rate <- test.rate
                        next
                    } else {
                        max.rate <- test.rate
                        break
                    }
                }
            }
            return(list(min.rate = min.rate, max.rate = max.rate))
        }
        max.rate <- c()
        min.rate <- c()
        if(sum(exflow, na.rm = TRUE) >= 1e-12) {
            range <- seq(from = 0, to = 1e8, by = 1e3)    
            hundreds <- npv(range, exflow, date)
            range <- seq(from = hundreds$min.rate, to = hundreds$max.rate, by = 10)
            tens <- npv(range, exflow, date)
            range <- seq(from = tens$min.rate, to = tens$max.rate, by = 1)
            ones <- npv(range, exflow, date)
            range <- seq(from = ones$min.rate, to = ones$max.rate, by = 0.01)
            decimals <- npv(range, exflow, date)
            return(mean(unlist(decimals))/100)
        } else {
            range <- seq(from = 0, to = -1e8, by = -1e3)
            hundreds <- npv(range, exflow, date)
            range <- seq(from = hundreds$min.rate, to = hundreds$max.rate, by = -10)
            tens <- npv(range, exflow, date)
            range <- seq(from = tens$min.rate, to = tens$max.rate, by = -1)
            ones <- npv(range, exflow, date)
            range <- seq(from = ones$min.rate, to = ones$max.rate, by = -0.01)
            decimals <- npv(range, exflow, date)
            return(mean(unlist(decimals))/100) 
        }
    }
}

Basically, given a vector of cashflows and a vector of corresponding dates, this function returns the annualized XIRR of an investment.

While it works well and gives answers that consistently match MS Excel and LibreOffice Calc, it is on the slow side, and I feel it could be improved by replacing the for() loop with an apply() function or something from the data.table package. The speed issues are barely noticeable on small examples, but on large datasets like mine, with a ton of edge cases, the slowdown can be pretty substantial.
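To make the idea concrete, here is a rough sketch of how the inner for() loop over test rates might be vectorized in base R. It is not a drop-in replacement for xirr2: the names fv_at and xirr_grid, the default search interval of -0.99 to 10, and the use of fractional rates instead of percentages are all assumptions made for illustration. It evaluates the same quantity as the loop (cashflows compounded forward to the last date) for a whole grid of rates at once with outer() and colSums(), then narrows the grid around the first sign change.

```r
# Value of all cashflows compounded forward to the last date, for many rates at once.
# `rates` are fractions (0.15 = 15%), unlike the percent grid used inside xirr2.
fv_at <- function(rates, exflow, date) {
  yrs <- as.numeric(max(date) - date) / 365
  # growth[i, j] = (1 + rates[j])^yrs[i]: rows are cashflows, columns are rates
  growth <- outer(yrs, 1 + rates, function(t, g) g^t)
  colSums(exflow * growth, na.rm = TRUE)
}

# Hypothetical helper: repeatedly narrow [lower, upper] around the first sign change.
xirr_grid <- function(exflow, date, lower = -0.99, upper = 10, n = 101, passes = 6) {
  for (k in seq_len(passes)) {
    rates <- seq(lower, upper, length.out = n)
    i <- which(diff(sign(fv_at(rates, exflow, date))) != 0)[1]
    if (is.na(i)) return(NA_real_)  # no sign change inside the current grid
    lower <- rates[i]
    upper <- rates[i + 1]
  }
  (lower + upper) / 2
}

flow1 <- c(-1000, -100, 100, 1200)
date1 <- as.Date(c("2018-01-01", "2018-10-31", "2019-03-31", "2019-03-31"))
xirr_grid(flow1, date1)  # should land close to the 15.32% that Excel reports
```

A short loop over refinement passes remains, but each pass is a single matrix operation instead of one data frame built per test rate, which is where most of the time in the original appears to go. Rates at or below -100% are left out of the grid on purpose, since (1 + rate)^t is not defined there for fractional t.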

For what it's worth, I've tried a multitude of other XIRR functions from various packages, including tvm, FinancialMath, and FinCal. For whatever reason, these functions tend to break down over time: solutions eventually stop converging and become inaccurate, particularly with large cashflows and rapid changes between positive/negative returns. This may be due to a common reliance on the uniroot() or polyroot() functions in R to calculate XIRR, but I'm not sure.

In any case, the above function actually gets me the numbers I want; I just need some help optimizing it for larger datasets. Thank you in advance!

EDIT

Thank you for the help so far. Here are some minimal examples:

Some deposits, some withdrawals, and then a complete withdrawal for a positive return. MS Excel shows XIRR = 15.32%:

> flow1 <- c(-1000,-100,100,1200)
> date1 <- as.Date(c("2018-01-01","2018-10-31","2019-03-31","2019-03-31"), format = "%Y-%m-%d")
> tvm::xirr(flow1,date1)
Error in uniroot(xnpv, interval = interval, cf = cf, d = d, tau = tau,  : 
  f.lower = f(lower) is NA
> xirr2(flow1,date1)
[1] 0.15315

An account receiving regular contributions with a poor return. MS Excel shows XIRR = -27.54%:

> flow2 <- c(-200,-200,-200,-200,-200,800)
> date2 <- as.Date(c("2018-01-01","2018-03-01","2018-06-01","2018-09-01","2019-01-01","2019-03-01"), format = "%Y-%m-%d")
> tvm::xirr(flow2,date2)
Error in uniroot(xnpv, interval = interval, cf = cf, d = d, tau = tau,  : 
  f.lower = f(lower) is NA
> xirr2(flow2,date2)
[1] -0.27535

Maybe I'm just using tvm::xirr() wrong? I'm not sure how to correct that uniroot() error.

OK, I figured it out thanks to this answer. It turns out that tvm::xirr() does work well (and is significantly faster than the above function), but I had been calling it incorrectly. Here is a working example:

> flow2 <- c(-200,-200,-200,-200,-200,800)
> date2 <- as.Date(c("2018-01-01","2018-03-01","2018-06-01","2018-09-01","2019-01-01","2019-03-01"), format = "%Y-%m-%d")
> tvm::xirr(flow2, date2, comp_freq = 1, maxiter = 100, tol = 1e-8, lower = -0.999, upper = 100)
[1] -0.2753857

I had tried this earlier with lower = -1, which gives the same error as above. So I was off by 0.001... so close. Thank you all again for your help!
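For anyone wondering why a lower bound of exactly -1 fails, here is a small base-R sketch. The xnpv function below is a hypothetical stand-in written for this illustration, not tvm's actual source, and it assumes a simple actual/365 day count. At a rate of -1 the discount factor divides by zero, so the negative flows become -Inf and the positive flow becomes Inf, and their sum is NaN. That is presumably what uniroot() is reporting with "f.lower = f(lower) is NA".

```r
# Hypothetical stand-in for an XNPV function (not tvm's implementation):
# discount every cashflow back to the first date, actual/365.
xnpv <- function(rate, cf, d) {
  t <- as.numeric(d - d[1]) / 365
  sum(cf / (1 + rate)^t)
}

flow2 <- c(-200, -200, -200, -200, -200, 800)
date2 <- as.Date(c("2018-01-01", "2018-03-01", "2018-06-01",
                   "2018-09-01", "2019-01-01", "2019-03-01"))

# At rate = -1 the later flows are divided by 0^t = 0, giving -Inf + Inf = NaN.
at_minus_one <- xnpv(-1, flow2, date2)
is.nan(at_minus_one)

# Keeping the bracket strictly above -1 gives uniroot() finite values at both ends.
root <- uniroot(xnpv, interval = c(-0.999, 100),
                cf = flow2, d = date2, tol = 1e-8)$root
root  # should be close to the -27.54% that Excel reports
```

The same reasoning suggests that any lower bound strictly greater than -1 would do, as long as the function still changes sign inside the interval.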


 