
How to optimize a ggplot graph with geom_bar

I am a beginner in R. I created a graph with ggplot and geom_bar that superimposes temperatures on precipitation. I use the option position = position_nudge(x = 0.4) so that the two sets of bars are not drawn on top of each other. However, when I use this option, the bar heights are calculated completely differently.

For example, as you will see below, I would like the bars to the right of each date to reach 30 or 31. Do you know how to solve this problem? Thank you in advance for your help.

Below are my table and my code.

SOUNAME            year_month  pre_type  pre_value  tem_type  tem_value  nb_species
WATERFORD (TYCOR)  2014-04     NONE      14         V_COLD    0          NA
WATERFORD (TYCOR)  2014-04     HEAVY     3          COLD      30         8
WATERFORD (TYCOR)  2014-04     LIGHT     7          HOT       0          NA
WATERFORD (TYCOR)  2014-04     MEDIUM    6          MEDIUM    0          NA
WATERFORD (TYCOR)  2014-05     NONE      15         V_COLD    0          NA
WATERFORD (TYCOR)  2014-05     HEAVY     3          COLD      31         17
WATERFORD (TYCOR)  2014-05     LIGHT     10         HOT       0          NA
WATERFORD (TYCOR)  2014-05     MEDIUM    3          MEDIUM    0          NA
WATERFORD (TYCOR)  2014-06     NONE      17         V_COLD    0          NA
WATERFORD (TYCOR)  2014-06     HEAVY     2          COLD      17         NA
WATERFORD (TYCOR)  2014-06     LIGHT     9          HOT       13         NA
WATERFORD (TYCOR)  2014-06     MEDIUM    2          MEDIUM    0          NA

ggplot(data = complet_w, 
       aes(x = complet_w$year_month, 
           y = complet_w$pre_value, 
           fill = complet_w$pre_type, 
           width=0.5), 
       stat = "identity") + 
  geom_bar(stat = "identity") + 
  xlab("date") + 
  ylab ("Number of days of precipitation") + 
  ggtitle("Precipitation per month") + 
  labs(fill = "Frequency") +
  geom_bar(data=complet_w,
           aes(x=complet_w$year_month, 
               y=complet_w$tem_value, 
               fill=complet_w$tem_type, 
               width=0.1), 
           stat = "identity", 
           position = position_nudge(x=0.4)) + 
  xlab("date") + 
  ylab("Number of days of temperature") + 
  ggtitle("Temperature per month") + 
  labs(fill = "Frequency") 

Below is my result. I would like all the bars to reach 30 or 31. Is that possible?

[Image: result]

You can replace position_nudge() with position_stack(). But to get the two graphs in one plot, you have to set the aesthetics of each geom_bar() separately.

Use your code like this:

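# Change Dataname, xval, yval and xlabs to suit your data.
# xval must be a factor so that as.numeric() returns its level codes 1, 2, ...

ggplot(Dataname) +
  geom_bar(stat = "identity", aes(x = as.numeric(xval) - 0.25, y = yval), width = 0.4) +
  geom_bar(stat = "identity", aes(x = as.numeric(xval) + 0.25, y = yval), width = 0.4) +
  scale_x_continuous(breaks = seq_along(xlabs), labels = xlabs)   # x is numeric now, so put the original labels back on the axis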

Hope this helps!

I remember your previous question. I think you are looking for something like this:

library(dplyr)
library(ggplot2)
library(reshape2)      #To use melt
View(TEMP_PREC_BIRR)
View(TEMP_BIRR)
write.csv(TEMP_PREC_BIRR,"Data1.csv")
write.csv(TEMP_BIRR,"Data2.csv")
Data1=melt(TEMP_PREC_BIRR,id.vars="year_month")
Data2=melt(TEMP_BIRR,id.vars="year_month")
Data=rbind(Data1,Data2)
Data=Data[!(Data$variable=="SOUNAME"),]
View(Data)
Data=read.csv("Data1.csv")
View(Data)
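# Data1.csv was written from TEMP_PREC_BIRR, so Data has its columns;
# refer to them by bare name inside aes()
ggplot(Data, aes(x = year_month, y = precipitation_value, fill = precipitation_type)) +
  geom_col()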

ggplot(data = TEMP_PREC_BIRR, aes(x = TEMP_PREC_BIRR$year_month, 
                                  y = TEMP_PREC_BIRR$precipitation_value, 
                                  fill = TEMP_PREC_BIRR$precipitation_type,width=0.2)) + 
  geom_bar(aes(x = as.numeric(year_month)+0.25, 
               y = TEMP_PREC_BIRR$precipitation_value, 
               fill = TEMP_PREC_BIRR$precipitation_type),
           stat = "identity",position = position_stack()) + 
  xlab("date") + ylab ("Number of days of precipitation") + 
  ggtitle("Precipitation per month - BIRR") + labs(fill = "Frequency")+
  geom_bar(data=TEMP_BIRR,aes(x=as.numeric(TEMP_BIRR$year_month)-0.25,
                                 y=TEMP_BIRR$temperature_value,
                                 fill=TEMP_BIRR$temperature_type), stat = "identity",position = position_stack()) +
  xlab("date") + ylab("Number of days of temperature") + 
  ggtitle("Temperature per month - BIRR") + labs(fill = "Frequency")+
  theme(panel.background=element_blank())

[Image: output plot]

There are a few things going on that don't quite fit with ggplot's setup:

  1. You've set the labels and the title multiple times. If you have a plot and then say ggtitle("Title 1") + ggtitle("Title 2"), only Title 2 will be displayed, because the last call overrides any previous ones. The same goes for your calls to xlab, ylab and labs.
  2. You have multiple instances of geom_bar, but they're doing essentially the same thing. This generally signals that your data should be in a long format, so that you can map a variable to an aesthetic you aren't currently using. I'd respectfully disagree with the previous answers that keep the multiple calls to geom_bar, and encourage you to instead reshape your data in line with the ggplot paradigm.
  3. Only in rare cases should you refer back to the name of your data frame inside a ggplot call, other than in data =. It can cause weird errors. So x = TEMP_PREC_BIRR$year_month should just be x = year_month.
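
Points 1 and 3 can be checked on a toy example. The sketch below uses a small made-up data frame (toy and its columns are assumptions for illustration, not the asker's data). It maps bare column names inside aes() and sets the title twice; the plot object ends up holding a single title, the last one set.

```r
library(ggplot2)

# Made-up data for illustration only
toy <- data.frame(month = c("2014-04", "2014-05"), days = c(30, 31))

p <- ggplot(toy, aes(x = month, y = days)) +  # bare column names, not toy$month
  geom_col() +
  ggtitle("Title 1") +
  ggtitle("Title 2")  # replaces "Title 1"

# The plot object stores a single title: the last one set
p$labels$title
```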

Getting your data into a long format made me realize a couple of things: you're currently comparing days of certain temperatures with days of certain precipitation levels. fill is then receiving both heavy precipitation and cold temperature, for example, but these aren't really comparable. You also have "MEDIUM" as both a temperature type and a precipitation level, which means the two will be lumped together as the same thing in fill. You'll want to adjust those type labels accordingly.

Having two types of measures—precipitation and temperature—that you want to display together but which are not directly comparable is a good use case for faceting, which I do in this first example.

I'm getting the data into a long shape by calling gather twice, once to make a column of measure types (pre or tem), and once to get a column of values. Because some type levels are duplicated across measure types, I'm using interaction to make types like "MEDIUM.pre" distinguishable from "MEDIUM.tem". This is still less than ideal, because you're giving fill colors on the same scale to different kinds of measures.
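
To see what interaction does here, this is a minimal base R sketch on made-up vectors (type and measure below are illustrative, not the full dataset). It joins the two inputs with a dot, so the same type under two measures becomes two distinct factor levels.

```r
# Made-up vectors for illustration: "MEDIUM" occurs under both measures
type    <- c("MEDIUM", "MEDIUM", "HEAVY")
measure <- c("pre", "tem", "pre")

# interaction() builds a factor with one level per type/measure combination;
# drop = TRUE discards combinations that never occur in the data
combined <- interaction(type, measure, drop = TRUE)
as.character(combined)
```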

Then I call facet_wrap(~ measure) to show the two measure types as being related but not the same. If your larger dataset has multiple locations, you could use facet_grid(SOUNAME ~ measure).
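
As a rough sketch of the multi-location case: the data frame below is made up, and "STATION B" is a hypothetical second location that does not appear in the question. With facet_grid(SOUNAME ~ measure), each location gets its own row of panels and each measure its own column.

```r
library(ggplot2)

# Made-up long-format data; "STATION B" is a hypothetical second location
long <- data.frame(
  SOUNAME    = rep(c("WATERFORD (TYCOR)", "STATION B"), each = 4),
  year_month = rep(c("2014-04", "2014-05"), times = 4),
  measure    = rep(rep(c("pre", "tem"), each = 2), times = 2),
  type       = rep(rep(c("HEAVY.pre", "COLD.tem"), each = 2), times = 2),
  value      = c(3, 3, 30, 31, 4, 2, 28, 31)
)

# One row of panels per location, one column per measure
p <- ggplot(long, aes(x = year_month, y = value, fill = type)) +
  geom_col() +
  facet_grid(SOUNAME ~ measure)
```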

library(tidyverse)

df <- structure(list(
  SOUNAME = c("WATERFORD (TYCOR)", "WATERFORD (TYCOR)","WATERFORD (TYCOR)", "WATERFORD (TYCOR)", "WATERFORD (TYCOR)","WATERFORD (TYCOR)", "WATERFORD (TYCOR)", "WATERFORD (TYCOR)","WATERFORD (TYCOR)", "WATERFORD (TYCOR)", "WATERFORD (TYCOR)","WATERFORD (TYCOR)"), 
  year_month = c("2014-04", "2014-04", "2014-04","2014-04", "2014-05", "2014-05", "2014-05", "2014-05", "2014-06","2014-06", "2014-06", "2014-06"), 
  pre_type = c("NONE", "HEAVY","LIGHT", "MEDIUM", "NONE", "HEAVY", "LIGHT", "MEDIUM", "NONE","HEAVY", "LIGHT", "MEDIUM"), 
  pre_value = c(14L, 3L, 7L, 6L, 15L,3L, 10L, 3L, 17L, 2L, 9L, 2L), 
  tem_type = c("V_COLD", "COLD","HOT", "MEDIUM", "V_COLD", "COLD", "HOT", "MEDIUM", "V_COLD","COLD", "HOT", "MEDIUM"), 
  tem_value = c(0L, 30L, 0L, 0L, 0L,31L, 0L, 0L, 0L, 17L, 13L, 0L), 
  nb_species = c(NA, 8L, NA, NA,NA, 17L, NA, NA, NA, NA, NA, NA)), 
  .Names = c("SOUNAME", "year_month","pre_type", "pre_value", "tem_type", "tem_value", "nb_species"), 
  class = c("tbl_df", "tbl", "data.frame"), 
  row.names = c(NA,-12L))

df %>%
  select(-nb_species) %>%
  gather(key = measure, value = type, pre_type, tem_type) %>%
  gather(key = pre_or_tem2, value = value, pre_value, tem_value) %>%
  mutate(measure = str_extract(measure, "^[a-z]+")) %>%
  # gathering twice pairs every type with both value columns; keep only the
  # rows where the type and the value belong to the same measure
  filter(measure == str_extract(pre_or_tem2, "^[a-z]+")) %>%
  select(-pre_or_tem2) %>%
  mutate(type = interaction(type, measure)) %>%
  ggplot(aes(x = year_month, y = value, fill = type)) +
    geom_col(position = "stack") +
    scale_fill_brewer(palette = "Set2") +
    facet_wrap(~ measure)
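
An alternative way to do the reshape, assuming tidyr 1.0.0 or later is installed, is pivot_longer() with the special ".value" name. This is a sketch on the four April rows only, not the code used for the plots in this answer. It splits each column name at the underscore, so every type stays paired with the value from its own measure.

```r
library(tidyr)

# The four April rows from the question's table
april <- data.frame(
  year_month = "2014-04",
  pre_type   = c("NONE", "HEAVY", "LIGHT", "MEDIUM"),
  pre_value  = c(14L, 3L, 7L, 6L),
  tem_type   = c("V_COLD", "COLD", "HOT", "MEDIUM"),
  tem_value  = c(0L, 30L, 0L, 0L)
)

# "pre_type" splits into measure = "pre" and the output column "type";
# "pre_value" splits into measure = "pre" and the output column "value"
long <- pivot_longer(
  april,
  cols      = c(pre_type, pre_value, tem_type, tem_value),
  names_to  = c("measure", ".value"),
  names_sep = "_"
)
long
```

The result has one row per original row and measure, with columns year_month, measure, type and value, which can go straight into ggplot(aes(x = year_month, y = value, fill = type)).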

In the second example I do some editorializing. I noticed that there are only a limited number of combinations of temperature and precipitation, so I instead set up a column of the interactions between the two; grouping by this, plus year_month, lets you sum up the number of days with a combination of precipitation level and temperature. I mapped this interaction column to fill, and dropped the faceting. Again, if you have observations at multiple locations, you could make use of facet_wrap(~ SOUNAME).

df %>%
  select(-nb_species) %>%
  gather(key = measure, value = value, pre_value, tem_value) %>%
  mutate(measure = str_extract(measure, "^[a-z]+")) %>%
  mutate(pre_type = paste0(pre_type, "pre")) %>%
  mutate(tem_type = paste0(tem_type, "tem")) %>%
  select(-measure) %>%
  mutate(type = interaction(pre_type, tem_type)) %>%
  group_by(SOUNAME, year_month, type) %>%
  summarise(value = sum(value)) %>%
  ungroup() %>%
  ggplot(aes(x = year_month, y = value, fill = type)) +
    geom_col() +
    scale_fill_brewer(palette = "Set2")

Created on 2018-05-19 by the reprex package (v0.2.0).

Final note: I dropped the nb_species column just to simplify the data I was working with, since it didn't appear in the plots; I used geom_col(), which is equivalent to geom_bar(stat = "identity"); and I used a Color Brewer palette because it was easier to differentiate with lots of colors. Feel free to ignore all of these steps.
