
R- Bar plot with continuous x and y

If this question has already been answered, please link to the answer, as I have not been able to locate a similar question. I have referred to R bar plot with 3 variables, Bar plot with multiple variables in R, ggplot with 2 y axes on each side and different scales, Bar Plot with 2 y axes and same x axis in R language [duplicate], and Bar Plot with 2 Y axes and same X- axis.

I have a dataset that includes species, observed value, expected value, and a standardized value from the observed and expected.

data <- structure(list(Species = c("BABO_BW", "BABO_BW", "BABO_BW", "BABO_RC", 
"BABO_RC", "BABO_RC", "BABO_SKS", "BABO_SKS", "BABO_SKS", "BABO_MANG", 
"BABO_MANG", "BABO_MANG", "BW_RC", "BW_RC", "BW_RC", "BW_SKS", 
"BW_SKS", "BW_SKS", "BW_MANG", "BW_MANG", "BW_MANG", "RC_SKS", 
"RC_SKS", "RC_SKS", "RC_MANG", "RC_MANG", "RC_MANG", "SKS_MANG", 
"SKS_MANG", "SKS_MANG"), variable = c("obs.C-score", "exp.C-score", 
"SES_Cscore", "obs.C-score", "exp.C-score", "SES_Cscore", "obs.C-score", 
"exp.C-score", "SES_Cscore", "obs.C-score", "exp.C-score", "SES_Cscore", 
"obs.C-score", "exp.C-score", "SES_Cscore", "obs.C-score", "exp.C-score", 
"SES_Cscore", "obs.C-score", "exp.C-score", "SES_Cscore", "obs.C-score", 
"exp.C-score", "SES_Cscore", "obs.C-score", "exp.C-score", "SES_Cscore", 
"obs.C-score", "exp.C-score", "SES_Cscore"), value = c(328680, 
276507, 6.73358774036271, 408360, 345488, 5.31345024375997, 285090, 
254670, 4.35376633657727, 12474, 12190, 1.24624427424057, 1450800, 
1809738, -11.0195450589776, 1507488, 1361088, 6.15672144449049, 
62706, 65780, -0.495728742814285, 1790156, 1700165, 2.70409191051284, 
45701, 86301, -4.71151949799025, 42240, 62745, -4.52203636797869
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-30L))

Sample Output

    Species    Variable       Value
1   BABO_BW    obs.C-score    328680.0000000
2   BABO_BW    exp.C-score    276507.0000000
3   BABO_BW    SES_Cscore     6.7335877
4   BABO_MANG  obs.C-score    12474.0000000
5   BABO_MANG  exp.C-score    12190.0000000
6   BABO_MANG  SES_Cscore     1.2462443
7   BABO_RC    obs.C-score    408360.0000000
8   BABO_RC    exp.C-score    345488.0000000
9   BABO_RC    SES_Cscore     5.3134502
10  BABO_SKS   obs.C-score    285090.0000000

I am trying to put the SES_Cscore on the x-axis and have the obs.C-score and exp.C-score as bars. The Species column groups the C-scores, so I would like to include those on the x-axis as well.

I have been able to plot the species on the x-axis and the other variables as bars.

library(ggplot2)
ggplot(data, aes(x = Species, y = value)) + 
    geom_bar(aes(fill = variable), stat = "identity", position = "dodge")

[Image: incorrect plot]

I would like to have the continuous variable of SES_Cscore as well on the x-axis. Is there a way to do this?

Thank you in advance and have a lovely day!

This could be done by reshaping the data slightly so that SES_Cscore is recorded as a variable with one value per Species, and not as a variable to be mapped to bar height for each observation. I do that here by reshaping wide (so that the three variables each get their own column), and then reshaping long again, but only for the variables we want to map to y.

library(tidyverse)
data %>%
  pivot_wider(names_from = variable, values_from = value) %>%
  pivot_longer(2:3) %>%
  mutate(Species2 = paste(Species, round(SES_Cscore,digits = 2), sep = "\n") %>%
           fct_reorder(SES_Cscore)) -> data2

data2
## A tibble: 20 × 5
#   Species   SES_Cscore name          value Species2         
#   <chr>          <dbl> <chr>         <dbl> <fct>            
# 1 BABO_BW        6.73  obs.C-score  328680 "BABO_BW\n6.73"  
# 2 BABO_BW        6.73  exp.C-score  276507 "BABO_BW\n6.73"  
# 3 BABO_RC        5.31  obs.C-score  408360 "BABO_RC\n5.31"  
# 4 BABO_RC        5.31  exp.C-score  345488 "BABO_RC\n5.31"  
# 5 BABO_SKS       4.35  obs.C-score  285090 "BABO_SKS\n4.35" 
# etc.

Alternatively, we could do the reshaping in a way that might be more performant for large data, by making it a join between the observations we want to map to y and the observations we want to use for each species' x position:

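# Join the bar heights (obs/exp rows) to one SES_Cscore row per species
left_join(data %>% filter(variable != "SES_Cscore"),
          data %>% filter(variable == "SES_Cscore") %>%
            transmute(Species, x_val = value,
                      Species_label = paste(Species, sprintf(value, 
                        fmt = "%#.2f"), sep = "\n") %>% fct_reorder(value)),
          by = "Species")  # explicit key avoids the "Joining with" message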

Once reshaped, it's more straightforward to get a plot that is ordered by the SES_Cscore for each species:

ggplot(data2, aes(Species2, value, fill = name)) +
  geom_col(position = "dodge")



If you want to plot with a continuous x axis related to SES_Cscore, you may run into some graphic design challenges, since the data might be bunched up in some cases. Note how the default bar width gets quite squished so that ggplot can keep the 2nd and 3rd Species bars from overlapping.

This approach also takes a little more work, since ggplot's axes work for either discrete (categorical) data or continuous data, and there isn't a default designed to manage a combination, with categorical data that is mapped continuously. So you'd have to resort to some sort of geom_text to make manual labels, plus some customization if you want them to look more like normal axis labels.

ggplot(data2, aes(SES_Cscore, value, fill = name)) +
  geom_col(position = "dodge") +
  ggrepel::geom_text_repel(aes(y = 0, label = Species), 
                           angle = 90, direction = "x", hjust = 0, lineheight = 0.8, size = 3,
                           data = data2 %>% distinct(Species, .keep_all = TRUE))
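
As a possible alternative to hand-placed text labels, the continuous axis itself can carry the species names by putting one break at each species' SES_Cscore. This is only a sketch: wide and long are hypothetical stand-ins built from the question's values (SES rounded to two decimals), and closely spaced scores such as -4.71 and -4.52 will still give crowded labels.

```r
library(ggplot2)

# Hypothetical stand-in for the question's data, one row per species pair
# (SES_Cscore rounded to two decimals for brevity).
wide <- data.frame(
  Species = c("BABO_BW", "BABO_RC", "BABO_SKS", "BABO_MANG", "BW_RC",
              "BW_SKS", "BW_MANG", "RC_SKS", "RC_MANG", "SKS_MANG"),
  obs = c(328680, 408360, 285090, 12474, 1450800,
          1507488, 62706, 1790156, 45701, 42240),
  exp = c(276507, 345488, 254670, 12190, 1809738,
          1361088, 65780, 1700165, 86301, 62745),
  SES_Cscore = c(6.73, 5.31, 4.35, 1.25, -11.02,
                 6.16, -0.50, 2.70, -4.71, -4.52)
)

# Long format: one row per bar, keeping SES_Cscore as the x position.
long <- rbind(
  data.frame(wide[c("Species", "SES_Cscore")],
             name = "obs.C-score", value = wide$obs),
  data.frame(wide[c("Species", "SES_Cscore")],
             name = "exp.C-score", value = wide$exp)
)

p <- ggplot(long, aes(SES_Cscore, value, fill = name)) +
  geom_col(position = "dodge") +
  # One axis break per species, labelled with the species name
  scale_x_continuous(breaks = wide$SES_Cscore, labels = wide$Species) +
  # Rotate the labels so neighbouring names collide less
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))

p
```

The trade-off is that the numeric SES_Cscore ticks disappear from the axis, so this suits cases where the ordering and spacing matter more than reading off exact scores.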


Up front, scaling the data and using a second axis can visually misrepresent the data: it's not hard to look at this plot hastily and infer that the blue bars' values mean the same thing as the red/green bars.

Having said that, try this:

library(ggplot2)
library(dplyr)
# Multiplier that brings SES_Cscore onto roughly the same scale as the C-scores
fac <- 50000
mycolors <- c("obs.C-score" = "red", "exp.C-score" = "green", "SES_Cscore" = "blue")
data %>%
  mutate(value = value * ifelse(variable == "SES_Cscore", fac, 1)) %>%
  ggplot(aes(x = Species, y = value)) +
  geom_bar(aes(fill = variable), stat = "identity", position = "dodge") +
  scale_y_continuous(
    # The secondary axis undoes the scaling so it reads in SES_Cscore units
    sec.axis = sec_axis(name = "SES_Cscore", ~ . / fac),
    breaks = ~ scales::extended_breaks()(pmax(0, .))
  ) +
  # The bars are mapped to fill, so the manual scale must be a fill scale
  scale_fill_manual(values = mycolors) +
  theme(
    axis.title.y.right = element_text(color = mycolors["SES_Cscore"]),
    axis.text.y.right = element_text(color = mycolors["SES_Cscore"]),
    axis.ticks.y.right = element_line(color = mycolors["SES_Cscore"])
  )

[Image: ggplot with a second axis]

I'm using blue on the second (right) axis to visually pair it with the blue bars. I also took the liberty of keeping the primary (left) axis breaks at 0 or more, based on my inference about the data; it is not required at all. Also, I could have omitted scale_fill_manual(.) and just assumed that our use of element_text(color = "blue") would be correct; that would fail if/when your data changes with either fewer or more levels within variable, so I control the colors manually... and I try to assign everything on the second axis the right color :-)
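
The scaling factor fac above is hard-coded. If the data changes, one way to pick it is from the ratio of the largest magnitudes on each axis. The following is a rough sketch rather than the answer's own method: df is a hypothetical stand-in holding a few of the question's values, and rounding to one significant digit is just one arbitrary choice to keep secondary-axis ticks tidy.

```r
# Hypothetical stand-in: a few rows copied from the question's data
df <- data.frame(
  variable = c("obs.C-score", "exp.C-score", "SES_Cscore",
               "obs.C-score", "exp.C-score", "SES_Cscore"),
  value = c(328680, 276507, 6.73, 1450800, 1809738, -11.02)
)

is_ses <- df$variable == "SES_Cscore"

# Ratio of the largest primary-axis magnitude to the largest SES magnitude
raw_fac <- max(abs(df$value[!is_ses])) / max(abs(df$value[is_ses]))

# Round to one significant digit so the factor is a "nice" number
fac <- signif(raw_fac, 1)
fac
```

A factor chosen this way makes the tallest SES bar roughly as tall as the tallest C-score bar; a smaller value, like the 50000 used above, keeps the SES bars visually subordinate.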


 