
ggplot2 stacked bars with categorical and numerical values while using position dodge2

I have a dataframe with certain factors and numerical values that I want to plot.

The data I have looks exactly like this (sorry, I can't reproduce a sample):

library(ggplot2)

df=structure(list(StartPos = c(6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 
       6, 6, 6, 6, 6, 6, 6, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 
       8, 8, 8, 8), Direction = c("Left", "Left", "Left", "Left", "Left", 
       "Left", "Left", "Left", "Left", "Right", "Right", "Right", "Right", 
       "Right", "Right", "Right", "Right", "Right", "Left", "Left", 
       "Left", "Left", "Left", "Left", "Left", "Left", "Left", "Right", 
       "Right", "Right", "Right", "Right", "Right", "Right", "Right", 
       "Right"), Velocity = c(36, 36, 36, 36, 36, 36, 36, 36, 36, -36, 
       -36, -36, -36, -36, -36, -36, -36, -36, 36, 36, 36, 36, 36, 36, 
       36, 36, 36, -36, -36, -36, -36, -36, -36, -36, -36, -36), Duration = c(0.2, 
       0.2, 0.2, 0.5, 0.5, 0.5, 1, 1, 1, 0.2, 0.2, 0.2, 0.5, 0.5, 0.5, 
       1, 1, 1, 0.2, 0.2, 0.2, 0.5, 0.5, 0.5, 1, 1, 1, 0.2, 0.2, 0.2, 
       0.5, 0.5, 0.5, 1, 1, 1), n_runs = c(12, 12, 12, 12, 12, 12, 12, 
       12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 
       12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12), Response = c("H", 
       "M", "W", "H", "M", "W", "H", "M", "W", "H", "M", "W", "H", "M", 
       "W", "H", "M", "W", "H", "M", "W", "H", "M", "W", "H", "M", "W", 
       "H", "M", "W", "H", "M", "W", "H", "M", "W"), n_hits = c(8, 1, 
       3, 10, 1, 1, 10, 2, 0, 10, 2, 0, 11, 1, 0, 10, 2, 0, 8, 3, 1, 
       9, 0, 3, 9, 3, 0, 10, 2, 0, 10, 2, 0, 12, 0, 0), p_test = c(0.66666667, 
       0.08333333, 0.25, 0.83333333, 0.08333333, 0.08333333, 0.83333333, 
       0.16666667, 0, 0.83333333, 0.16666667, 0, 0.91666667, 0.08333333, 
       0, 0.83333333, 0.16666667, 0, 0.66666667, 0.25, 0.08333333, 0.75, 
       0, 0.25, 0.75, 0.25, 0, 0.83333333, 0.16666667, 0, 0.83333333, 
       0.16666667, 0, 1, 0, 0)), class = "data.frame", row.names = c(NA, 
       -36L))

My goal is to plot the StartPos values together with their directions, durations and response percentages (H, M, W: Hit, Miss, Wrong) so that the percentages for each combination are stacked in a single bar. It's kind of difficult for me to explain, so I'll just show you what I've already tried:

df36= ggplot() +
  geom_bar(data=df, mapping=aes(x=as.factor(StartPos), fill=Duration, 
  y=p_test),stat="identity", position="dodge2") + 
  labs(x="StartPos", y="Hitrate") + ggtitle("Velocity 36°") + theme_bw() +
  scale_fill_gradient(low="red", high="green")


df36

The resulting plot looks like this:

[image: the resulting plot]

It looks a bit crowded and confusing, but I'll get to the point. The red bars represent a duration of 0.2 s, brown 0.5 s and green 1.0 s. Within each color, the first bar shows the percentage of Hits, the second the percentage of Misses and the last the percentage of Wrongs. There are also the start positions 6 and 8. The three color groups on the left of StartPos 6 are stimuli with the direction Left; the three color groups immediately to their right are stimuli with the direction Right. The same goes for StartPos 8.

That is basically what I need, but it doesn't look good. What I want is to stack the percentages of H, M and W, so that there are no small bars beside each Hit percentage, and to mark them somehow so that the percentage of H can be distinguished from M and so on. If that's not possible, I'd like to draw each response in a different color and show the colors in a legend, for example M in black and W in yellow.

Is there any way to do this? I'm kinda lost now. Thanks in advance!

This isn't exactly what you want (it doesn't use position = "dodge2"), but I think it produces output that is clearer than plotting all that data on a single x-axis. The variables are grouped, which makes labelling and, more importantly, interpretation easier, and that should be the main objective of a plot. We stack the bars with fill = Response, then use facet_grid to split the groups up. As an aside, I don't think scale_fill_gradient should be used for categorical variables.

# set facet labels
facet_labels <- as_labeller(c(`Left` = "Direction = Left", `Right` = "Direction = Right", `6` = "StartPos = 6", `8` = "StartPos = 8"))

ggplot(df) + 
    geom_col(aes(x = as.factor(Duration), y = p_test, fill = Response)) + 
    facet_grid(Direction ~ StartPos, switch = "y", labeller = facet_labels) +
    xlab("Duration") +
    ylab("Response proportion")

Output: [image]
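If you would rather keep every Direction and Duration combination on one x-axis, as in the original attempt, one possible variation is to build a combined x variable and stack the responses on it. This is only a sketch: the `combo` column, the color choices (following the "M in black and W in yellow" idea from the question) and the compact rebuild of `df` with `expand.grid` are my own assumptions, and `p_test` is recomputed as `n_hits / 12` instead of using the rounded values.

```r
library(ggplot2)

# Compact reconstruction of the question's data (same 36 rows, same row order).
# Assumption: p_test is n_hits / n_runs, with n_runs = 12 for every row.
df <- expand.grid(
  Response  = c("H", "M", "W"),
  Duration  = c(0.2, 0.5, 1),
  Direction = c("Left", "Right"),
  StartPos  = c(6, 8),
  stringsAsFactors = FALSE
)
df$n_hits <- c(8, 1, 3, 10, 1, 1, 10, 2, 0, 10, 2, 0, 11, 1, 0, 10, 2, 0,
               8, 3, 1, 9, 0, 3, 9, 3, 0, 10, 2, 0, 10, 2, 0, 12, 0, 0)
df$p_test <- df$n_hits / 12

# Hypothetical helper column: one x position per Direction/Duration pair,
# ordered Left 0.2, Left 0.5, Left 1, Right 0.2, ...
df$combo <- interaction(df$Direction, df$Duration, sep = " / ", lex.order = TRUE)

p <- ggplot(df, aes(x = combo, y = p_test, fill = Response)) +
  geom_col() +                                   # geom_col stacks by default
  facet_wrap(~ StartPos, labeller = label_both) +
  scale_fill_manual(
    breaks = c("H", "M", "W"),
    labels = c("Hit", "Miss", "Wrong"),
    values = c(H = "forestgreen", M = "black", W = "yellow")
  ) +
  labs(x = "Direction / Duration (s)", y = "Response proportion") +
  theme_bw()

p
```

Each bar is one StartPos, Direction and Duration combination, and its H, M and W segments add up to 1, so the legend alone tells the responses apart. Stacking and dodging cannot be combined directly in a single layer, which is why the combinations are spread along the x-axis here instead of being dodged.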
