
Understanding a line of R code

There is a blog post on R-bloggers about how to transform stock prices in R into returns. It transforms the first row to 1 and then tracks the returns going forward. They have this example of stock data:

##            AAPL.Close MSFT.Close GOOG.Close
## 2016-01-04     105.35      54.80     741.84
## 2016-01-05     102.71      55.05     742.58
## 2016-01-06     100.70      54.05     743.62
## 2016-01-07      96.45      52.17     726.39
## 2016-01-08      96.96      52.33     714.47
## 2016-01-11      98.53      52.30     716.03

When you load the magrittr library (which apparently provides a "pipe" operator) and you run this line of code:

stock_return <- apply(stocks, 1, function(x) {x / stocks[1,]}) %>% t %>% as.xts

you get this:

           AAPL.Close MSFT.Close GOOG.Close
2016-01-04  1.0000000  1.0000000  1.0000000
2016-01-05  0.9749407  1.0045620  1.0009975
2016-01-06  0.9558614  0.9863139  1.0023994
2016-01-07  0.9155197  0.9520073  0.9791734
2016-01-08  0.9203607  0.9549271  0.9631052
2016-01-11  0.9352634  0.9543796  0.9652081

I don't understand how this line of code works. I know the apply function will operate on each row (the parameter 1 accomplishes that). I know that in theory I want to divide the first row by itself (which will give 1 for that row) and then divide each succeeding row by the first row which will show how the 1.00 investment changes over time.

So this part of the code is the function:

{x / stocks[1,]}) %>% t %>% 

It has something to do with a "pipe operator" and the operations working left to right instead of inside out. Can someone help me understand the syntax of this function and how it accomplishes what it is supposed to? I can just use it but I would rather not have it be a black box. Thanks!

First thing to note is that this code only works if the input is a matrix. To make it reproducible read in the data as follows:

library(magrittr)
library(xts)
df <- read.table(text =" ,AAPL.Close, MSFT.Close, GOOG.Close
            2016-01-04,     105.35,      54.80,     741.84
            2016-01-05,     102.71,      55.05,     742.58
            2016-01-06,     100.70,      54.05,     743.62
            2016-01-07,      96.45,      52.17,     726.39
            2016-01-08,      96.96,      52.33,     714.47
            2016-01-11,      98.53,      52.30,     716.03", 
            sep = ",", header = TRUE, row.names = 1, strip.white = TRUE)
stock <- as.matrix(df)

If you struggle to understand how the whole pipe works at once just split it into its components.
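As a small sketch of what the pipe itself does: x %>% f passes x as the first argument of f, so a chain read left to right is the same as nested calls read inside out. The matrix m below is a made-up toy input, not the stock data.

```r
library(magrittr)

# Toy 2 x 3 matrix, used only to illustrate the pipe.
m <- matrix(1:6, nrow = 2)

# Piped form: start with m, transpose it, then ask for the dimensions.
piped <- m %>% t %>% dim

# The same computation written as nested calls (read inside out).
nested <- dim(t(m))

# Both forms give the same result.
identical(piped, nested)
```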

The first part is the apply function.

apply(stock, 1, function(x) {x / stock[1,]})

As you correctly stated, it works on the rows of the matrix, meaning it takes each row of the matrix and passes them separately to the provided function (there is actually a for loop working in the background). An easy case for a function would be mean.

apply(stock, 1, mean)

This would simply compute the mean of each row. The author of the blog did something slightly more complicated and provided the anonymous function (a function just written for this special task) function(x) {x / stock[1,]} .

So sequentially each row of the matrix is passed as the first argument to the provided function. The function has only one argument x . So you can think of x as a vector representing one row of your original matrix. To figure out what the function is doing look at its body x / stock[1,] .

x represents a row of the original matrix and stock[1, ] is the first row of the original matrix. Hence each row of the matrix is, one after another, divided element by element by the first row of the matrix, and the results are added as columns to a new matrix, which looks like this:

                       2016-01-04             2016-01-05             2016-01-06
AAPL.Close                      1              0.9749407              0.9558614
MSFT.Close                      1              1.0045620              0.9863139
GOOG.Close                      1              1.0009975              1.0023994
                       2016-01-07             2016-01-08             2016-01-11
AAPL.Close              0.9155197              0.9203607              0.9352634
MSFT.Close              0.9520073              0.9549270              0.9543796
GOOG.Close              0.9791734              0.9631053              0.9652081
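The way each row's result ends up as a column can be sketched with an explicit loop. This only illustrates the idea and is not how apply is implemented internally. The names prices and by_hand are made up for the example, and only the first three rows of the data are used.

```r
# First three rows of the price data from the question, as a plain matrix.
prices <- matrix(
  c(105.35, 54.80, 741.84,
    102.71, 55.05, 742.58,
    100.70, 54.05, 743.62),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("2016-01-04", "2016-01-05", "2016-01-06"),
                  c("AAPL.Close", "MSFT.Close", "GOOG.Close"))
)

# Hand-written counterpart of apply(prices, 1, function(x) x / prices[1, ]):
# the result for row i is stored as COLUMN i of the output.
by_hand <- matrix(NA_real_, nrow = ncol(prices), ncol = nrow(prices),
                  dimnames = list(colnames(prices), rownames(prices)))
for (i in seq_len(nrow(prices))) {
  by_hand[, i] <- prices[i, ] / prices[1, ]
}

# The apply() call from the answer, for comparison.
via_apply <- apply(prices, 1, function(x) x / prices[1, ])
all.equal(by_hand, via_apply)
```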

Compared to the final output, rows and columns are transposed here. If you transpose this matrix using t (the second function in the pipe) you get the desired output matrix. The last function, as.xts, simply converts the matrix to a special type of time series object, an xts object.
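Written without the pipe, the same steps might look like the sketch below, again on a made-up three-row matrix called prices. The as.xts step is left as a comment so the example runs with base R only. The sweep call at the end is a different technique from the blog's one-liner: it divides every column by the matching entry of the first row and needs no transpose.

```r
# First three rows of the price data from the question, as a plain matrix.
prices <- matrix(
  c(105.35, 54.80, 741.84,
    102.71, 55.05, 742.58,
    100.70, 54.05, 743.62),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("2016-01-04", "2016-01-05", "2016-01-06"),
                  c("AAPL.Close", "MSFT.Close", "GOOG.Close"))
)

# Step 1: divide every row by the first row; dates end up in the columns.
step1 <- apply(prices, 1, function(x) x / prices[1, ])

# Step 2: transpose so that dates are back in the rows.
step2 <- t(step1)

# Step 3 (needs the xts package): convert to a time series object.
# stock_return <- xts::as.xts(step2)

# Alternative without apply(): divide each column by its first-row value.
alt <- sweep(prices, 2, prices[1, ], "/")
all.equal(step2, alt)
```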
