
ggplot2: geom_bar and position_dodge: not centered to x axis (factor)

I have two issues with this plot. I want to make the bars wider (with less spacing between the groups), and I want each group of bars to be centered on its x factor value.

I have a continuous variable on the y-axis and a factor value on the x-axis. I have three groups for each factor.

Here is an example of my issue with the Iris data:

library(ggplot2)

d <- iris

ggplot(d) +
  geom_col(aes(x = as.factor(Sepal.Length), y = Petal.Width, fill = as.factor(Species)),
           position = position_dodge(preserve = "single"), width = 1) +
  theme(axis.text.x = element_text(angle = 90))

This gets you something probably closer to what you're looking for (I assume d in your example is iris):

ggplot(iris) +
    geom_col(
        aes(
            x=as.factor(Sepal.Length), y=Petal.Width, fill=Species
        ),
        position = position_dodge(0.5),
        width=0.5) + 
    theme(axis.text.x = element_text(angle = 90, vjust=0.5))


Now, for the explanation of what I changed and why:

Text positioning on the x-axis. You used element_text(angle=90) to change the direction of the text. This is correct, but it only changes the angle, not the positioning or alignment. By default, horizontal text is vertically aligned to be "at the top". If you run the code above and use vjust=1 in place of vjust=0.5, you'll see it goes back to the way it appears for you, with the tick marks aligned to the "top" of the x-axis text.

as.factor(Species). There is no need to declare Species a factor. Run str(iris) and you'll see that iris$Species is already a factor. It doesn't change the result, except that it messes with the title of the legend.
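A quick way to check this, as a minimal sketch that uses only base R and the built-in iris dataset (the variable names are just for illustration):

```r
# iris ships with base R (datasets package), so nothing needs to be installed.
species_is_factor <- is.factor(iris$Species)
species_levels <- levels(iris$Species)

print(species_is_factor)
print(species_levels)

# Wrapping an existing factor in as.factor() hands back the same factor.
same_after_wrap <- identical(as.factor(iris$Species), iris$Species)
print(same_after_wrap)
```

The legend title changes because ggplot2 uses the expression written inside aes() as the default title, so fill=as.factor(Species) is labeled "as.factor(Species)" while fill=Species is labeled "Species".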

position_dodge width and width. This one is best explained by experimenting with the values in the two terms position_dodge(0.5) and width=0.5. Play with them yourself and you'll see what each one does, but here's the general explanation:

  • Total column width for each position on the x-axis is determined by width=0.5, the argument to geom_col(). So, for every Sepal.Length factor in this graph, "0.5" is used as the total width of the column (or columns) in that space. "1.0" would mean "I want all columns to touch each other" and something like "0.2" means "I want skinny columns". "0" means... I don't want columns - give them a width of zero!

  • How far the "sub-columns" (each Species column for each Sepal.Length in this example) are pushed apart is controlled by the position_dodge(width=0.5) term. Using the same value as width=0.5 makes the sub-columns touch each other exactly. Higher values will split them apart and lower values will squish them together, where 0 means they are on top of one another. If you make the value really large, the sub-columns run into neighboring columns...

Again - play around with those terms and you should get how they work together.
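If you'd rather see the arithmetic than eyeball the plot, here is a rough sketch in plain R. dodge_bounds() is a hypothetical helper written for this illustration, not a ggplot2 function, and the formula is my approximation of how position_dodge() places bars: the bar width is shared among the n sub-columns, and the dodge width sets how far their centers are spread.

```r
# Hypothetical helper (NOT part of ggplot2): approximate the left and right
# edges of n dodged bars centered on x.
dodge_bounds <- function(x, n, bar_width, dodge_width) {
  idx <- seq_len(n)
  # Centers are spread evenly across the dodge width.
  centers <- x + dodge_width * ((idx - 0.5) / n - 0.5)
  # Each sub-column gets an equal share of the total bar width.
  half <- bar_width / n / 2
  data.frame(group = idx, xmin = centers - half, xmax = centers + half)
}

# Two sub-columns at x = 1 with total bar width 0.5.
touching <- dodge_bounds(x = 1, n = 2, bar_width = 0.5, dodge_width = 0.5)
spread   <- dodge_bounds(x = 1, n = 2, bar_width = 0.5, dodge_width = 1)
stacked  <- dodge_bounds(x = 1, n = 2, bar_width = 0.5, dodge_width = 0)

print(touching)  # dodge width equal to bar width: edges meet
print(spread)    # larger dodge width: a gap opens between the bars
print(stacked)   # dodge width 0: both bars cover the same range
```

Under this sketch, the bars touch when the two values are equal, separate when the dodge width is larger, and sit on top of one another when it is 0.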

Another possible solution is to use position_dodge2, which will center each bar on its x value:

ggplot(iris, aes(x = as.factor(Sepal.Length), y = Petal.Width, fill = Species))+
  geom_col(position = position_dodge2(preserve = "single", width = 1))+
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))

