Using Lagged Values Conditionally in R

I want to take the split_coefficient value from each row where split_coefficient != 1 and use it in calculations on the adjusted_close values for the prior dates in the data frame. I'm trying to write a loop in R that multiplies the adjusted_close values by the split_coefficient, up to but not including the row that contains the split_coefficient != 1, and then repeats the process through the end of the data set. I can identify the rows with split_coefficient != 1 using which(y[,6] != 1), but I cannot figure out how to write the loops to accomplish this task. Any help on how to create this loop would be greatly appreciated. Thank you in advance.

timestamp   open    high    low close   adjusted_close  split_coefficient
7/20/2018   31.61   31.72   30.95   31.04   31.04   1
7/19/2018   31.17   31.57   30.69   31.19   31.19   1
7/18/2018   30.53   31.33   30.26   30.63   30.63   1
7/17/2018   31.67   31.825  30.49   30.89   30.89   1
7/16/2018   31.24   31.79   31  31.23   31.23   1
7/13/2018   32.06   32.37   31.36   31.45   31.45   1
7/12/2018   32.29   32.68   31.69   31.69   31.69   1
7/11/2018   33.37   33.47   32.43   32.93   32.93   1
7/10/2018   32.19   32.8185 31.75   31.84   31.84   1
7/9/2018    33.32   33.37   32.249  32.48   32.48   0.25
7/6/2018    36.03   36.17   34.15   34.23   34.23   1
7/5/2018    36.47   37.46   36.05   36.09   36.09   1
7/3/2018    36.28   37.8299 36  37.33   37.33   1
7/2/2018    38.74   39.22   37.03   37.08   37.08   1
6/29/2018   36.71   37.06   35.78   37  37  1
6/28/2018   38.88   40.51   37.46   38.03   38.03   0.35
6/27/2018   36.14   39.43   35.21   38.56   38.56   1
6/26/2018   36.54   37.89   35.715  36.48   36.48   1
6/25/2018   34.24   39.745  34.24   38.11   38.11   1
6/22/2018   33.04   33.57   32.72   33.06   33.06   1
6/21/2018   32.26   34.84   32.21   34.15   34.15   1
6/20/2018   32.13   32.21   31.655  32.02   32.02   0.5
6/19/2018   33.33   33.92   32.43   32.79   32.79   1
6/18/2018   32.55   33.02   31.19   31.24   31.24   1
6/15/2018   31.94   32.52   31.52   31.67   31.67   1
6/14/2018   31.5    31.83   30.91   31.33   31.33   1
6/13/2018   31.58   32.45   31.44   32.39   32.39   1
6/12/2018   31.86   32.41   31.66   31.97   31.97   1
6/11/2018   32.67   32.77   31.91   32.09   32.09   1
6/8/2018    33.46   33.56   32.41   32.6    32.6    1

I'll try to clarify my question: on 6/20/18, the split_coefficient is 0.50. I want to multiply the adjusted_close values from 6/8/18 to 6/19/18 by that split_coefficient of 0.5. The split_coefficient then changes to 0.35 on 6/28/18, so I want to multiply the adjusted_close values from 6/21/18 to 6/27/18 by 0.35. Since the split_coefficient changes periodically, I thought a loop or series of loops would accomplish this.

Based on what I wrote above, I am looking for the following output, with a new column named New.Adj.close that contains the values calculated by multiplying the adjusted_close values for 6/8/18 - 6/19/18 by the split_coefficient from 6/20/18:

timestamp   open    high    low close   adjusted_close  dividend_amount split_coefficient   New.Adj.close
6/19/2018   33.33   33.92   32.43   32.79   32.79   0   1   16.395
6/18/2018   32.55   33.02   31.19   31.24   31.24   0   1   15.62
6/15/2018   31.94   32.52   31.52   31.67   31.67   0   1   15.835
6/14/2018   31.5    31.83   30.91   31.33   31.33   0   1   15.665
6/13/2018   31.58   32.45   31.44   32.39   32.39   0   1   16.195
6/12/2018   31.86   32.41   31.66   31.97   31.97   0   1   15.985
6/11/2018   32.67   32.77   31.91   32.09   32.09   0   1   16.045
6/8/2018    33.46   33.56   32.41   32.6    32.6    0   1   16.3

To clarify, do you just want to multiply adjusted_close by split_coefficient for the observations where split_coefficient equals 1? If so,

library(dplyr)
y %>%
  filter(split_coefficient == 1) %>%
  mutate(new_col = split_coefficient * adjusted_close)

Apologies if I misunderstood the question.

As highlighted in the comments, explicit loops are usually avoided in R because better vectorized alternatives are available. For example, you can use ifelse:

df <-
  data.frame(
    adjusted_close = sample(1:5, 10, TRUE),
    split_coefficient = sample(1:2, 10, TRUE)
  )

#    adjusted_close split_coefficient
# 1               5                 1
# 2               2                 2
# 3               3                 2
# 4               2                 2
# 5               4                 2
# 6               5                 2
# 7               1                 1
# 8               2                 1
# 9               2                 2
# 10              2                 1

df$m <- ifelse(df$split_coefficient == 1,
               df$adjusted_close, 
               df$adjusted_close * df$split_coefficient
               )

# df
#    adjusted_close split_coefficient  m
# 1               5                 1  5
# 2               2                 2  4
# 3               3                 2  6
# 4               2                 2  4
# 5               4                 2  8
# 6               5                 2 10
# 7               1                 1  1
# 8               2                 1  2
# 9               2                 2  4
# 10              2                 1  2

Okay, this uses the tidyverse, but you can recode it in base R or whatever you prefer. The important thing is the logic. As mentioned, you do not normally want to use loops for a task like this, and in this case you would need something like a while loop. Instead, take advantage of vectorization.

measure_date <- seq(as.Date("2000/1/1"), by = "day", length.out = 20)
pattern <- c(.5, 1,1,1,1)
split_coefficient <- c(pattern, pattern, pattern, pattern)
value_to_multiply <- c(1:20)

df <- data.frame(measure_date, value_to_multiply, split_coefficient)

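# Sort by date ascending, because the OP's data is in reverse chronological order
df <- dplyr::arrange(df, measure_date)

# Change the 1s to NAs so that only the actual split coefficients remain.
df$newsplit <- ifelse(df$split_coefficient == 1, NA, df$split_coefficient)

# tidyr::fill() fills downward by default, so each NA takes the most recent
# non-NA coefficient from the rows above it.
df <- tidyr::fill(df, newsplit)
df$multiplied <- df$value_to_multiply * df$newsplit
df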

Results

   measure_date value_to_multiply split_coefficient newsplit multiplied
1    2000-01-01                 1               0.5      0.5        0.5
2    2000-01-02                 2               1.0      0.5        1.0
3    2000-01-03                 3               1.0      0.5        1.5
4    2000-01-04                 4               1.0      0.5        2.0
5    2000-01-05                 5               1.0      0.5        2.5
6    2000-01-06                 6               0.5      0.5        3.0
7    2000-01-07                 7               1.0      0.5        3.5
8    2000-01-08                 8               1.0      0.5        4.0
9    2000-01-09                 9               1.0      0.5        4.5
10   2000-01-10                10               1.0      0.5        5.0
11   2000-01-11                11               0.5      0.5        5.5
12   2000-01-12                12               1.0      0.5        6.0
13   2000-01-13                13               1.0      0.5        6.5
14   2000-01-14                14               1.0      0.5        7.0
15   2000-01-15                15               1.0      0.5        7.5
16   2000-01-16                16               0.5      0.5        8.0
17   2000-01-17                17               1.0      0.5        8.5
18   2000-01-18                18               1.0      0.5        9.0
19   2000-01-19                19               1.0      0.5        9.5
20   2000-01-20                20               1.0      0.5       10.0
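
The example above fills each coefficient forward in time, whereas the question asks for a coefficient to be applied to the dates before its split. A minimal base R sketch of that variant follows. It makes several assumptions that are not confirmed by the original posts: the data frame name y and the column New.Adj.close are taken from the question, only a handful of the question's rows are used, rows that contain a split are left unchanged, and rows with no later split get a factor of 1. The helper names split_pos, next_split and factor_next are made up for illustration.

```r
# A few rows from the question's data, newest first as in the question.
y <- data.frame(
  timestamp = as.Date(c("2018-06-29", "2018-06-28", "2018-06-27", "2018-06-26",
                        "2018-06-20", "2018-06-19", "2018-06-18")),
  adjusted_close    = c(37.00, 38.03, 38.56, 36.48, 32.02, 32.79, 31.24),
  split_coefficient = c(1, 0.35, 1, 1, 0.5, 1, 1)
)

# Work in chronological order so that "the next split" is a later row.
y <- y[order(y$timestamp), ]

is_split  <- y$split_coefficient != 1
split_pos <- which(is_split)

# findInterval() counts the split rows at or before each row, so adding 1
# gives the index (within split_pos) of the first split strictly after it.
next_split  <- split_pos[findInterval(seq_len(nrow(y)), split_pos) + 1]
factor_next <- y$split_coefficient[next_split]

# Assumption: rows with no later split, and the split rows themselves,
# are left unadjusted.
factor_next[is.na(factor_next)] <- 1
factor_next[is_split] <- 1

y$New.Adj.close <- y$adjusted_close * factor_next
y
```

With these rows, 6/18 and 6/19 are multiplied by 0.5 and 6/26 and 6/27 by 0.35, in line with the description in the question. If the split rows should instead take the following split's coefficient, drop the line that resets factor_next[is_split].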
