I have a data frame called sx16.
It has the columns Date, Open, High, Low, Settle.
I want to add a column called up_period that holds a 1 if the calculation below is positive and a 0 if it is negative:
sx16$Settle[ 1: nrow(sx16)] - sx16$Settle[ 2: nrow(sx16)]
Of course, this produces an error as the new list is shorter than the original sx16.
I have tried to wrap rbind.fill around it like so:
sx16$up_period <- rbind.fill(sx16$Settle[ 1: nrow(sx16)] - sx16$Settle[ 2: nrow(sx16)])
But this produces the following warning:
Warning message: In sx16$Settle[1:nrow(sx16)] - sx16$Settle[2:nrow(sx16)] : longer object length is not a multiple of shorter object length
Of course, that is exactly what I thought rbind.fill would solve. Here is where I am stuck. Once I get this, I can add a simple if-else to do the 1 and 0, but I cannot figure out how to add this shorter column to my data frame.
Try this (the last up_period value is left undefined, i.e. NA):
sx16$up_period <- sx16$Settle - c(sx16$Settle[-1],NA)
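The line above stores the raw difference rather than the 1/0 flag the question asks for. A minimal sketch of the remaining step, assuming a toy sx16 that holds only the Settle values from the question (the intermediate name change is hypothetical):

```r
# Toy stand-in for sx16, using the Settle values from the question
sx16 <- data.frame(Settle = c(954, 950.25, 945.5, 952.5, 945.25, 955))

# Each row minus the row below it; the last row has no successor, so it gets NA
change <- sx16$Settle - c(sx16$Settle[-1], NA)

# A logical converts to 1/0, and NA stays NA
sx16$up_period <- as.integer(change > 0)
```

Note that a difference of exactly zero is flagged as 0 here, which may or may not be what you want.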
You can use lead from the dplyr package:
library(dplyr)
result <- sx16 %>% mutate(up_period=as.numeric((Settle-lead(Settle,default=NA)) > 0))
## Date Open High Low Settle up_period
##1 2016-09-30 950.00 958.50 943.00 954.00 1
##2 2016-09-29 947.00 957.25 946.00 950.25 1
##3 2016-09-28 951.75 955.75 944.50 945.50 0
##4 2016-09-27 946.75 953.50 934.00 952.50 1
##5 2016-09-26 951.50 960.25 943.75 945.25 0
##6 2016-09-23 975.00 976.25 952.50 955.00 NA
Here, we explicitly set the default parameter of lead to NA to fill in the value at the end, which shows that it could be set to another value, such as the last value, if we want. Note that there is also no need for an if-else, as we can convert the boolean to 1/0 using as.numeric.
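As a sketch of the "another value" option, the padding could be the last Settle value instead of NA. This assumes dplyr is installed and uses a toy sx16 holding only the Settle values from the question. The final row then compares the value with itself and is flagged 0:

```r
library(dplyr)

# Toy stand-in for sx16, using the Settle values from the question
sx16 <- data.frame(Settle = c(954, 950.25, 945.5, 952.5, 945.25, 955))

# Pad the shifted column with the last Settle value rather than NA,
# so the last row's difference is 0 and its flag is 0
result <- sx16 %>%
  mutate(up_period = as.numeric(Settle - lead(Settle, default = last(Settle)) > 0))
```

Whether 0 or NA is the better choice for the final row depends on whether "no information" should be distinguishable from "not up".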
The dput for your data is:
sx16 <- structure(list(Date = structure(c(17074, 17073, 17072, 17071,
17070, 17067), class = "Date"), Open = c(950, 947, 951.75, 946.75,
951.5, 975), High = c(958.5, 957.25, 955.75, 953.5, 960.25, 976.25
), Low = c(943, 946, 944.5, 934, 943.75, 952.5), Settle = c(954,
950.25, 945.5, 952.5, 945.25, 955)), .Names = c("Date", "Open",
"High", "Low", "Settle"), row.names = c(NA, -6L), class = "data.frame")
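I'm surprised nobody mentioned diff yet. diff(sx16$Settle) is the equivalent of sx16$Settle[2:nrow(sx16)] - sx16$Settle[1:(nrow(sx16)-1)]. That is the negative of the difference in the question, so an up period corresponds to a negative diff. The following would therefore work for you: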
sx16$up_period <- c(ifelse(diff(sx16$Settle)<0, 1, 0), NA)
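A quick check of that equivalence, as a sketch on the Settle values from the question (the names settle, n and manual are only for illustration):

```r
# Settle values from the question, newest first
settle <- c(954, 950.25, 945.5, 952.5, 945.25, 955)
n <- length(settle)

# diff() subtracts each element from the one after it
manual <- settle[2:n] - settle[1:(n - 1)]
stopifnot(isTRUE(all.equal(diff(settle), manual)))

# A negative diff means the current row is higher than the next one;
# the trailing NA pads the result back to the original length
up_period <- c(ifelse(diff(settle) < 0, 1, 0), NA)
```

Because diff returns one element fewer than its input, the explicit NA is what makes the result fit as a data frame column.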
I'll use the iris data set:
x <- iris
dummy <- x$Sepal.Length              # copy the column under the name dummy
dummy[length(dummy) + 1] <- 0        # append a 0 for the day that hasn't happened yet
dummy <- dummy[2:length(dummy)]      # drop the first element so row i holds the value from row i + 1
x <- cbind(x, dummy)                 # add the shifted column to the data
x$up <- x$Sepal.Length - x$dummy     # new calculated column
x$dummy <- NULL                      # remove the helper column
So essentially, I copied your column, shifted it up by one position, and then calculated against that dummy column.