How to make multiple plots for positive and negative outcomes only in R?

Let's say I have the following data frame:

ID     amount_ID   timespan    change
3      1           20          2
3      2           40          3
3      3           60          6
3      4           80          4
3      5           100         5
9      1           25          1
9      2           50          -2
9      3           75          0
9      4           100         -1
3      1           33.33       4
3      2           66.67       8
3      3           100         7
9      1           33.33       1
9      2           66.67       3
9      3           100         4

Ronak Shah helped me create 2 plots; the code for them is shown below. The plots show the average change per ID on the y-axis and the timespan on the x-axis.

library(ggplot2)
library(dplyr)

df %>%
  arrange(ID, timespan) %>%
  group_by(ID) %>%
  mutate(change = cummean(change)) %>%
  ggplot() + aes(timespan, change) + 
  geom_line() + 
  facet_wrap(.~ID, scales = "free_y")

The 2 plots

Now I have stumbled upon a new problem. How can I make 3 new plots based on the last change per ID, depending on whether that number is positive or negative? That is, 1 plot for a specific ID if the last change is positive and 1 plot for that ID if the last change is negative. In the df we see that ID 3 ends positive 2 times, while ID 9 ends negative once and positive once. So in total this would give us 3 new plots of the average change per ID.

Thanks!

The best way to do this is via facet_grid, which I'll demonstrate below. However, you may want to think hard about what the level of replication is in this experiment. The data appear to contain multiple trials for each ID, but your cummean calculation groups all of them together, which means you cannot plot each trial separately once they have been aggregated. I've kept the data broken up by trial in this exercise, as it would not make much sense otherwise.
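To illustrate the point about replication, here is a minimal sketch with made-up numbers (the toy data frame and the avg column are hypothetical, not taken from the question). It compares cummean grouped by ID only with cummean grouped by ID and trial.

```r
library(dplyr)

# Toy data (hypothetical values): one ID measured in two trials
toy <- tibble(
  ID       = c(3, 3, 3, 3),
  trial    = c(1, 1, 2, 2),
  timespan = c(50, 100, 50, 100),
  change   = c(2, 4, 6, 8)
)

# Grouped by ID only: the running mean carries over from trial 1 into trial 2
by_id <- toy %>%
  arrange(ID, trial, timespan) %>%
  group_by(ID) %>%
  mutate(avg = cummean(change)) %>%
  ungroup()

# Grouped by ID and trial: the running mean restarts at the start of each trial
by_trial <- toy %>%
  arrange(ID, trial, timespan) %>%
  group_by(ID, trial) %>%
  mutate(avg = cummean(change)) %>%
  ungroup()

by_id$avg     # 2 3 4 5
by_trial$avg  # 2 3 6 7
```

With grouping by ID only, the values for the second trial depend on what happened in the first, so the two trials can no longer be drawn as separate lines in a meaningful way.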

First, create some example data. Note that it is best to avoid df as a name for a data frame, as df is already a function in R (the density of the F distribution).

library(ggplot2)
library(dplyr)

# Create some example data. Note the trial variable, which identifies each trial.
# sample() is random, so the values differ between runs unless a seed is set.
dd <- expand.grid(ID = 1:5,
                  trial = 1:10,
                  timespan = c(20, 40, 60, 80, 100))
dd$change <- sample(-10:10, size = nrow(dd), replace = TRUE)
dd <- arrange(dd, ID, trial, timespan)

The data looks like this:

   ID trial timespan change
1   1     1       20     -9
2   1     1       40     -3
3   1     1       60      5
4   1     1       80     -3
5   1     1      100     10
6   1     2       20      7
7   1     2       40     10
8   1     2       60     -6
9   1     2       80    -10
10  1     2      100     -4

Then create a column last_change which signifies whether a trial ended on a negative or positive change. We do this by taking, for each ID x trial combination, the change recorded at the largest (last) timespan of that trial. Because the maximum is computed per trial, this code still works if some trials run for different amounts of time.

dd <- dd %>%
  group_by(ID, trial) %>%
  mutate(last_change = ifelse(change[timespan == max(timespan)]<0, "neg", "pos")) %>%
  ungroup()
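As a quick sanity check on the labels, one could count how many trials per ID end up in each category. The sketch below rebuilds the example data so that it runs on its own; the seed value and the name label_counts are my own choices for illustration. Note that with the `< 0` test a final change of exactly 0 is labelled "pos".

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so that this sketch is repeatable

# Same structure as the example data: 5 IDs x 10 trials x 5 time points
dd <- expand.grid(ID = 1:5,
                  trial = 1:10,
                  timespan = c(20, 40, 60, 80, 100))
dd$change <- sample(-10:10, size = nrow(dd), replace = TRUE)

# Label each trial by the sign of the change at its last time point
dd <- dd %>%
  group_by(ID, trial) %>%
  mutate(last_change = ifelse(change[timespan == max(timespan)] < 0, "neg", "pos")) %>%
  ungroup()

# Reduce to one row per trial, then count the labels within each ID
label_counts <- dd %>%
  distinct(ID, trial, last_change) %>%
  count(ID, last_change)

label_counts
```

If distinct() returned more than one row for a given ID and trial, that would indicate that the label is not constant within a trial and the grouping should be revisited.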

In order to get 3 facets in the plot, we'll need to duplicate all of the data and assign it a value of "all" in the last_change column.

#repeat all of the data with a last_change value of "all" for the third plot
dd.dup <- mutate(dd, last_change = "all")

dd.all <- bind_rows(dd, dd.dup)

We can then feed this dataset into the code you had before. Note that I have added trial and last_change to the group_by call because these are the units of replication. You will need to think about whether that makes sense for your experimental design.

dd.all %>%
  arrange(ID, trial, timespan) %>%
  group_by(ID, trial, last_change) %>%
  mutate(change = cummean(change)) %>%
  ungroup() %>%
  ggplot(aes(x = timespan, y = change)) +
  geom_line(aes(group = trial)) +
  facet_grid(ID~last_change, scales = "free_y")

This results in the following plot:

(image: the resulting plot, not reproduced here)
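The same idea can be sketched on the data frame from the question. That data has no trial column, so the code below assumes that a new trial starts whenever amount_ID resets to 1; this is my reading of the table, not something the question states. The names qdat, trial and avg_change are introduced here for illustration. It uses facet_wrap rather than facet_grid so that only ID and last_change combinations that actually occur get a panel.

```r
library(dplyr)
library(ggplot2)

# The data frame from the question, typed in row by row
qdat <- data.frame(
  ID        = c(3, 3, 3, 3, 3, 9, 9, 9, 9, 3, 3, 3, 9, 9, 9),
  amount_ID = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 1, 2, 3),
  timespan  = c(20, 40, 60, 80, 100, 25, 50, 75, 100,
                33.33, 66.67, 100, 33.33, 66.67, 100),
  change    = c(2, 3, 6, 4, 5, 1, -2, 0, -1, 4, 8, 7, 1, 3, 4)
)

qdat <- qdat %>%
  group_by(ID) %>%
  # ASSUMPTION: amount_ID == 1 marks the first row of a new trial
  mutate(trial = cumsum(amount_ID == 1)) %>%
  ungroup() %>%
  arrange(ID, trial, timespan) %>%
  group_by(ID, trial) %>%
  mutate(
    # Sign of the change at the last time point of the trial
    last_change = ifelse(change[timespan == max(timespan)] < 0, "neg", "pos"),
    # Running mean of change within the trial
    avg_change = cummean(change)
  ) %>%
  ungroup()

# One panel per existing ID x last_change combination, one line per trial
p <- ggplot(qdat, aes(x = timespan, y = avg_change)) +
  geom_line(aes(group = trial)) +
  facet_wrap(~ ID + last_change, scales = "free_y")

p
```

Under that assumption there are three panels: ID 3 with "pos" (containing two lines, one per trial), ID 9 with "neg", and ID 9 with "pos", which matches the 3 plots asked for in the question.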
