简体   繁体   中英

How to shade the area between two time series only if value A is greater than value B in ggplot?

I am trying to chart 2 time series (indexed stock prices) on ggplot , AAPL and MSFT. I want to shade the the area between these two lines but only when the AAPL indexed price is higher than that of MSFT. How do I accomplish this?

I have been reading about using geom_ribbon() but saw that some people said it is problematic and doesn't work when the two lines do not cross. I also have not been able to get the code to work. How do I set my ymin and ymax values for geom_ribbon() ? I tried geom_area() as well but then all I created was a stacked area graph.

Here is my code so far:

install.packages("tidyquant")
install.packages("ggplot2")

library(tidyquant)
library(ggplot2)

symbols <- c("AAPL", "MSFT")
data <- tq_get(symbols, get = "stock.prices", from = "2016-01-01")

S1_index <-data$adjusted[which(data$symbol == "AAPL" & data$date == min(data$date))] 
S2_index <-data$adjusted[which(data$symbol == "MSFT" & data$date == min(data$date))] 

data$adjusted <- ifelse(data$symbol == "AAPL", data$adjusted/S1_index,
                        ifelse(data$symbol == "MSFT", data$adjusted/S2_index,NA))

ggplot(data,aes(x=date, y=adjusted,colour= symbol)) +
  geom_line() +
  scale_colour_manual(values = c(AAPL = "darkblue", MSFT = "red")) +
  ggtitle("Title Here") + xlab("X Axis Label Here") + ylab("Y Axis Label Here") + 
  theme(plot.title = element_text(hjust = 0.5)) +
  scale_x_date(date_labels = "%b %y", date_breaks = "6 months") + 
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
        panel.background = element_blank(), axis.line = element_line(colour = "black"))+
  labs(color = "Company") + 
  theme(legend.title = element_blank())

I would like the area between two time series to be shaded when AAPL is higher than MSFT, but currently my code doesn't accomplish that. I'm not very proficient in using ggplot, so I would appreciate any advice you might have.

You can use a ribbon to show the area between the two lines, but it'll require a bit of tweaking to only show the area when AAPL is higher than MSFT. Assume data is the link to the .csv file you've posted and the dates were formatted. First, we're going to build a seperate data.frame in a typical ribbon-like format:

ribbondata <- data.frame(
  # We'll keep the x-values for one of the lines
  x = data$date[data$symbol == "AAPL"],
  # Next we are going to take the pairwise minima and maxima along the lines
  ymin = pmin(data$adjusted[data$symbol == "AAPL"], data$adjusted[data$symbol == "MSFT"]),
  ymax = pmax(data$adjusted[data$symbol == "AAPL"], data$adjusted[data$symbol == "MSFT"]),
  # Then, we'll save a variable for which observations to keep
  keep = data$adjusted[data$symbol == "AAPL"] > data$adjusted[data$symbol == "MSFT"]
)

Then here is how I would filter out regions we do not want to shade, and also attach some id variable to the data that keeps track of stretches of data that we do indeed want to shade. We'll use run length encoding for this:

keep_rle <- rle(ribbondata$keep)
# Now we'll replace every TRUE with a counter integer
keep_rle$values[keep_rle$values] <- seq_len(sum(keep_rle$values))

Next, we'll attach the inverse of this run length encoded id to the ribbondata dataframe and remove the bits where ribbondata$KEEP == FALSE .

ribbondata$id <- inverse.rle(keep_rle)
ribbondata <- ribbondata[ribbondata$keep,]

Then, we'll use the plotting code you provided:

g <- ggplot(data,aes(x=date, y=adjusted,colour= symbol)) +
  geom_line() +
  scale_colour_manual(values = c(AAPL = "darkblue", MSFT = "red")) +
  ggtitle("Title Here") + xlab("X Axis Label Here") + ylab("Y Axis Label Here") + 
  theme(plot.title = element_text(hjust = 0.5)) +
  scale_x_date(date_labels = "%b %y", date_breaks = "6 months") + 
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
        panel.background = element_blank(), axis.line = element_line(colour = "black"))+
  labs(color = "Company") + 
  theme(legend.title = element_blank())

And attach our ribbondata to it:

g <- g + geom_ribbon(data = ribbondata, 
                     aes(x = x, ymin = ymin, ymax = ymax, group = id), 
                     inherit.aes = FALSE)

Now the trick here is to attach our calculated id variable to the group in the aes() call, so that ggplot doesn't interpret the ribbon as a continuous object and draw weird lines at x-values where ribbondata y-values are undefined. Also I've set inherit.aes = FALSE because the ribbondata has different names for x and ymin/ymax variables than the main data .

I ended up with the following plot:

在此处输入图片说明

Of course, you can give the ribbon any fill colour or alpha that you want. Good luck!

First reshape your data.

data <- data %>% 
    
# Select down to the necessary columns
select(date, symbol, adjusted) %>%
        
# Pivot to create columns for both symbols
pivot_wider(names_from = symbol, values_from = adjusted) %>%

# Create new variables for ribbon
mutate(max1 = ifelse(AAPL >= MSFT, AAPL, MSFT)) %>%
mutate(max2 = ifelse(MSFT >= AAPL, MSFT, AAPL)) 

Next, create your ggplot object

g1 <- data %>%

    # Set PlotAesthetics
    ggplot(aes(x=date, y=AAPL)) + 
    
    # First ribbon creates the color above MSFT and below AAPL
    geom_ribbon(aes(ymin=MSFT, ymax=AAPL), fill="grey", alpha=0.9) +
    
    # Second ribbon removes anything below MSFT
    geom_ribbon(aes(ymin=0, ymax=MSFT), fill="white", alpha=0.9) +
    
    # Add lines for AAPL and MSFT 
    geom_line(aes(y=AAPL), color = "blue") +
    geom_line(aes(y=MSFT), color = "red") +
    
    # Create Labels
    labs(x = "X Axis Label Here", y = "Y Axis Label Here",
         title = "Title Here") + 

    # Set Theme to match your original plot
    theme_classic() +

    # Need to create custom legend 
    annotate(geom = "text", x = ymd('2020-06-01'), y = .25, label = "AAPL", hjust = "left") +
    annotate(geom = "segment", x = ymd('2020-03-01'), xend = ymd('2020-05-01'), y = .25, yend = .25, colour = "blue", size = 1) +
    annotate(geom = "text", x = ymd('2020-06-01'), y = .05, label = "MSFT", hjust = "left") +
    annotate(geom = "segment", x = ymd('2020-03-01'), xend = ymd('2020-05-01'), y = .05, yend = .05, colour = "red", size = 1)

I realise this is a bit late, but is an alternative approach to achieving what @bgm was after.

Here is the associated plot

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM