
How to extract the min/max values in a dataframe to display data as a ribbon?

I have several sets of data stored in a data frame. For the sake of this question, I provide below a way to generate this data frame, but IRL, I only have the merged data frame, not the intermediate ones.

x <- seq.POSIXt(from = strptime("1970-01-01 00:00:00", format = "%Y-%m-%d %H:%M:%S"),
                to = strptime("1970-01-01 00:05:00", format = "%Y-%m-%d %H:%M:%S"),
                by = "10 sec")

x <- rep(x, each = 3)
y <- c()

set.seed(1)

for (i in 1:length(x)) {
  y <- c(y, runif(1, min = 0, max = i))
}

my.data.frame1 <- data.frame(x, y, data = as.factor("data1"))

y <- c()
for (i in 1:length(x)) {
  y <- c(y, runif(1, min = length(x) - i, max = length(x)))
}

my.data.frame2  <- data.frame(x, y, data = as.factor("data2"))

merged <- rbind(my.data.frame1, my.data.frame2)

library(ggplot2)

ggplot(merged, aes(x, y, color = data)) + geom_point() + geom_line()

So for each type of data (data1 and data2), and for each date value on the x axis, I have 3 y values.

The plot looks bad, like this:

[image: plot of the merged data as points and lines]

What I want to do is to plot a geom_ribbon of the data but I don't know how to do it.

I first tried to extract the min and max values for each time with an aggregate function, as explained here, and to build a new data frame without duplicate x values, but I couldn't make it work.

Can anyone help?

Edit:

The code I tried with aggregate is this one:

aggregate(y ~ x, data = merged, max)

(Same for the min.) But this does not distinguish between the data1 set and the data2 set. I know I could subset, but I guess it can be done using the "by" argument. I just couldn't make it work.

You were on the right track; you need to aggregate by both data and x instead of just x.

You can either calculate the min and max by group separately in two aggregate calls and then merge or do both at the same time. For the second approach you'll need an additional step to get the output of the two functions into separate columns.
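To illustrate why the extra step is needed, here is a minimal sketch on a toy data frame (the names toy, g, v, agg and flat are made up for this example and are not from the question). When FUN returns a vector of length two, aggregate stores the result as a single matrix column rather than two ordinary columns:

```r
# Hypothetical toy data, not taken from the question
toy <- data.frame(g = c("a", "a", "b", "b"), v = c(1, 5, 2, 8))

# FUN returns two values per group, so `v` becomes a two-column matrix
agg <- aggregate(v ~ g, data = toy,
                 FUN = function(z) c(min = min(z), max = max(z)))

ncol(agg)         # 2: the grouping column `g` plus one matrix column `v`
is.matrix(agg$v)  # TRUE
colnames(agg$v)   # "min" "max"

# Flattening through a list splits the matrix into ordinary columns
flat <- as.data.frame(as.list(agg))
names(flat)       # "g" "v.min" "v.max"
```

The code that follows applies the same flattening to the aggregate of merged, which is where the y.min and y.max column names come from.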

my.new.df = aggregate(y ~ data + x, data = merged, FUN = function(x) c(min = min(x), max = max(x)))

# Get the min and max as separate columns
my.new.df = as.data.frame(as.list(my.new.df))

ggplot(my.new.df, aes(x, fill = data)) + 
    geom_ribbon(aes(ymin = y.min, ymax = y.max), alpha = 0.6)
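The first approach mentioned above (two separate aggregate calls followed by a merge) could look roughly like the sketch below. The small merged data frame built here is only a stand-in for the one in the question, and the names mins, maxs and ranges are illustrative choices:

```r
# Small stand-in for the question's `merged` data frame:
# 3 y values per x value for each of two data sets
set.seed(1)
x <- rep(as.POSIXct("1970-01-01 00:00:00", tz = "UTC") + seq(0, 50, by = 10),
         each = 3)
merged <- rbind(
  data.frame(x, y = runif(length(x), 0, 10), data = factor("data1")),
  data.frame(x, y = runif(length(x), 10, 20), data = factor("data2"))
)

# One aggregate call per statistic, grouped by both data and x
mins <- aggregate(y ~ data + x, data = merged, FUN = min)
maxs <- aggregate(y ~ data + x, data = merged, FUN = max)

# Rename the value columns so they do not collide in the merge
names(mins)[names(mins) == "y"] <- "y.min"
names(maxs)[names(maxs) == "y"] <- "y.max"

# Join on the grouping columns: one row per (data, x) combination
ranges <- merge(mins, maxs, by = c("data", "x"))
```

The resulting ranges data frame has y.min and y.max columns, so it can be passed to the same geom_ribbon call as my.new.df above.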

You can also make the plot directly using stat = "summary" in geom_ribbon instead of making an aggregate dataset for plotting.

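# In ggplot2 3.3.0 and later these arguments are named fun.max / fun.min;
# older versions used fun.ymax / fun.ymin
ggplot(merged, aes(x, y, fill = data)) + 
    geom_ribbon(alpha = 0.6, stat = "summary", fun.max = max, fun.min = min)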
