
Order categorical data in a stacked bar plot with ggplot2

I have a data frame with the following entries:

structure(list(hhDomMil = c("HED", "ETB", "HED", "ETB", "PER", 
"BUM", "EXP", "TRA", "TRA", "PMA", "MAT", "MAT", "KON", "ETB", 
"PMA", "PMA", "HED", "BUM", "BUM", "HED", "PMA", "PMA", "HED", 
"TRA", "BUM", "EXP", "BUM", "PMA", "ETB", "MAT", "ETB", "ETB", 
"KON", "MAT", "TRA", "BUM", "BUM", "TRA", "TRA", "PMA", "PMA", 
"PMA", "MAT", "ETB", "TRA", "BUM", "TRA", "MAT", "BUM", "ETB", 
"TRA", "TRA", "BUM", "KON", "ETB", "ETB", "ETB", "BUM", "KON", 
"ETB", "ETB", "PMA", "TRA", "PER", "PER", "MAT", "HED", "KON", 
"TRA", "TRA", "TRA", "EXP", "TRA", "BUM", "MAT", "MAT", "TRA", 
"PMA", "HED", "PER", "TRA", "PER", "EXP", "PER", "BUM", "KON", 
"BUM", "ETB", "ETB", "TRA", "PER", "ETB", "KON", "KON", "BUM", 
"ETB", "BUM", "MAT", "BUM", "KON", "KON", "ETB", "MAT", "KON", 
"PER", "ETB", "ETB", "KON", "PMA", "PER", "HED", "HED", "PMA", 
"MAT", "PMA", "PER", "PMA", "TRA", "TRA", "MAT", "BUM", "BUM", 
"KON", "ETB", "ETB", "ETB", "PMA", "TRA", "TRA", "PMA", "PER", 
"KON", "PER", "BUM", "KON", "ETB", "ETB", "BUM", "TRA", "ETB", 
"PMA", "HED", "MAT", "TRA", "BUM", "PMA", "BUM", "ETB", "TRA", 
"TRA", "TRA", "PER", "EXP", "HED", "BUM", "EXP", "HED", "BUM", 
"MAT", "DDR", "BUM", "MAT", "KON", "HED", "HED", "TRA", "BUM", 
"PMA", "PMA", "PMA", "KON", "KON", "MAT", "ETB", "MAT", "TRA", 
"MAT", "ETB", "ETB", "TRA", "MAT", "ETB", "TRA", "HED", "BUM", 
"MAT", "TRA", "PMA", "BUM", "BUM", "EXP", "ETB", "EXP", "EXP", 
"MAT", "TRA", "KON", "BUM", "BUM", "HED"), kclust = c(1L, 2L, 
15L, 4L, 5L, 6L, 5L, 7L, 8L, 5L, 6L, 5L, 11L, 6L, 5L, 1L, 9L, 
10L, 2L, 1L, 9L, 8L, 4L, 11L, 14L, 5L, 8L, 11L, 12L, 5L, 5L, 
14L, 15L, 2L, 10L, 6L, 8L, 4L, 6L, 8L, 14L, 14L, 16L, 10L, 5L, 
1L, 12L, 17L, 12L, 16L, 16L, 5L, 10L, 14L, 8L, 19L, 5L, 4L, 4L, 
14L, 2L, 14L, 9L, 7L, 1L, 14L, 4L, 15L, 18L, 16L, 9L, 14L, 6L, 
14L, 12L, 11L, 4L, 7L, 8L, 12L, 9L, 16L, 2L, 6L, 15L, 1L, 1L, 
3L, 14L, 5L, 5L, 9L, 14L, 6L, 5L, 14L, 15L, 2L, 14L, 2L, 1L, 
8L, 5L, 10L, 1L, 1L, 16L, 5L, 2L, 9L, 9L, 1L, 12L, 10L, 1L, 4L, 
1L, 9L, 8L, 8L, 5L, 10L, 1L, 10L, 2L, 6L, 15L, 2L, 2L, 10L, 5L, 
6L, 10L, 19L, 19L, 6L, 5L, 6L, 7L, 7L, 8L, 5L, 16L, 5L, 6L, 6L, 
1L, 10L, 12L, 4L, 7L, 19L, 7L, 8L, 16L, 10L, 5L, 16L, 12L, 7L, 
7L, 19L, 4L, 6L, 1L, 15L, 7L, 8L, 16L, 4L, 10L, 15L, 11L, 10L, 
1L, 10L, 17L, 1L, 2L, 1L, 14L, 8L, 8L, 14L, 10L, 8L, 6L, 6L, 
8L, 5L, 7L, 5L, 1L, 5L, 7L, 9L, 2L, 1L, 9L, 14L), order = c(9, 
1, 9, 1, 3, 7, 10, 5, 5, 2, 8, 8, 4, 1, 2, 2, 9, 7, 7, 9, 2, 
2, 9, 5, 7, 10, 7, 2, 1, 8, 1, 1, 4, 8, 5, 7, 7, 5, 5, 2, 2, 
2, 8, 1, 5, 7, 5, 8, 7, 1, 5, 5, 7, 4, 1, 1, 1, 7, 4, 1, 1, 2, 
5, 3, 3, 8, 9, 4, 5, 5, 5, 10, 5, 7, 8, 8, 5, 2, 9, 3, 5, 3, 
10, 3, 7, 4, 7, 1, 1, 5, 3, 1, 4, 4, 7, 1, 7, 8, 7, 4, 4, 1, 
8, 4, 3, 1, 1, 4, 2, 3, 9, 9, 2, 8, 2, 3, 2, 5, 5, 8, 7, 7, 4, 
1, 1, 1, 2, 5, 5, 2, 3, 4, 3, 7, 4, 1, 1, 7, 5, 1, 2, 9, 8, 5, 
7, 2, 7, 1, 5, 5, 5, 3, 10, 9, 7, 10, 9, 7, 8, 6, 7, 8, 4, 9, 
9, 5, 7, 2, 2, 2, 4, 4, 8, 1, 8, 5, 8, 1, 1, 5, 8, 1, 5, 9, 7, 
8, 5, 2, 7, 7, 10, 1, 10, 10, 8, 5, 4, 7, 7, 9)), .Names = c("hhDomMil", 
"kclust", "order"), row.names = c(NA, 200L), class = "data.frame")

I want to create a stacked bar plot like this one: [image: bar chart]

The only problem is that I would like the stacks to follow this order: ETB, PMA, PER, KON, TRA, DDR, BUM, MAT, HED, EXP (these are the numbers in the order column of the data). I also have some aesthetic problems. I searched for a solution here, but none of the ordering suggestions worked for me... :-\

  1. How do I produce such an ordered plot?
  2. How do I set up x so that each bar sits "on" one number?
  3. How do I separate the bars? Here I tried that with a white border.
  4. How do I print all kclust numbers on the x axis?

Thanks a lot for your help! Dominik


Here is the code I used to draw my plot:

mycols <- c('#FFFD00', '#97CB00', '#3168FF', '#FF0200', '#FB02FE',
            '#CCFCCC', '#FE9900', '#98CBF8', '#00CCFF', '#00FD03') # Set milieu colors

ggplot(MilDis) +
  geom_bar(aes(kclust, fill=factor(hhDomMil),
               colour=mycols), position='fill', binwidth=1, colour='white') +
  scale_fill_manual(values = mycols)


This is how I solved it in the end:

    mycols <- c('#3168FF', '#00CCFF', '#98CBF8', '#CCFCCC', '#00FD03',
                '#97CB00', '#FFFD00', '#FE9900', '#FB02FE', '#FF0200') # Set milieu colors

    ggplot(MilDis) +
      geom_bar(aes(factor(kclust), fill=reorder(hhDomMil,order)),
               position='fill') +
      scale_fill_manual(values = mycols)

With this result: [image: the resulting plot]


Thank you all for your help!

This is easily solved by getting your data formatted correctly before passing it to ggplot(). The key is to explicitly set the levels of the hhDomMil factor. Assuming your data are in dat:

dat <- transform(dat, hhDomMil = factor(hhDomMil,
                                        levels = c("ETB", "PMA", "PER", "KON",
                                                   "TRA", "DDR", "BUM", "MAT",
                                                   "HED", "EXP")))

That converts hhDomMil to a factor in place inside dat and sets the levels in the order you wanted:

> head(dat$hhDomMil)
[1] HED ETB HED ETB PER BUM
Levels: ETB PMA PER KON TRA DDR BUM MAT HED EXP

Notice what happens when R coerces hhDomMil to a factor without explicit levels:

> head(factor(as.character(dat$hhDomMil)))
[1] HED ETB HED ETB PER BUM
Levels: BUM DDR ETB EXP HED KON MAT PER PMA TRA

The default is to sort the levels alphabetically, which is why the plot is coming out as you show.
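To make the difference concrete, here is a minimal base R sketch. The four-value vector is a hypothetical sample, not the asker's full data; it only contrasts the default level order with an explicitly supplied one.

```r
# Hypothetical four-value sample (not the full data from the question)
x <- c("HED", "ETB", "PER", "BUM")

# Default: levels are the sorted unique values
default_levels <- levels(factor(x))

# Explicit: levels follow the order passed to `levels =`
explicit_levels <- levels(factor(x, levels = c("ETB", "PER", "BUM", "HED")))

print(default_levels)   # "BUM" "ETB" "HED" "PER"
print(explicit_levels)  # "ETB" "PER" "BUM" "HED"
```

ggplot2 stacks and lists the legend entries in level order, so whichever of these two orders the factor carries is the order the plot shows.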

The best advice I can give is to get your data correctly formatted first and only then try to plot it. Don't rely on automatic or on-the-fly conversion to get this right for you; sooner or later it won't produce what you want.
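Putting that advice together, a rough end-to-end sketch might look like the following. The eight-row data frame and the four-level subset are made up for illustration; only the column names hhDomMil and kclust come from the question.

```r
library(ggplot2)

# Hypothetical miniature data set with the same column names as the question
dat <- data.frame(
  hhDomMil = c("HED", "ETB", "PMA", "ETB", "BUM", "PMA", "HED", "BUM"),
  kclust   = c(1L, 1L, 1L, 2L, 2L, 4L, 4L, 4L)
)

# Step 1: fix the stacking order in the data
dat$hhDomMil <- factor(dat$hhDomMil, levels = c("ETB", "PMA", "BUM", "HED"))

# Step 2: plot. factor(kclust) gives one labelled bar per cluster,
# and colour = "white" draws a thin border between the segments.
p <- ggplot(dat, aes(x = factor(kclust), fill = hhDomMil)) +
  geom_bar(position = "fill", colour = "white") +
  labs(x = "kclust", y = "share", fill = "hhDomMil")

print(p)
```

Because the ordering lives in the data rather than in the plotting call, any other plot built from dat should pick up the same order.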

I see that you have an order column in your data frame, which I gather encodes the ordering you want. Hence you can simply do:

p0 = qplot(factor(kclust), fill = reorder(hhDomMil, order), position = 'fill', 
       data = df1)

Here are the elements of this code that take care of your questions

  1. How do I produce such an ordered plot? reorder(hhDomMil, order)
  2. How do I set up x so that each bar sits "on" one number? factor(kclust)
  3. How do I separate the bars?
  4. How do I print all kclust numbers on the x axis? factor(kclust)
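As a small illustration of what reorder() does here: it sorts the factor levels by a per-level summary of the second argument (the mean by default). The vectors below are hypothetical stand-ins for the hhDomMil and order columns.

```r
# Hypothetical sample: each label always carries the same order number
labels <- c("HED", "ETB", "PMA", "ETB", "PER", "HED")
ord    <- c(9, 1, 2, 1, 3, 9)

# reorder() sorts the levels by the per-level summary of `ord`
# (the mean, unless another FUN is supplied)
f <- reorder(labels, ord)

print(levels(f))  # "ETB" "PMA" "PER" "HED"
```

Since every label maps to a single order number, the mean equals that number and the levels come out in exactly the requested sequence.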

I remember from a previous question of yours that the hhDomMil corresponded to different groups, and I suspect your ordering follows the grouping. In that case, you might want to use that information to choose a color palette that makes it simpler to follow the graph. Here is one way to do it.

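library(RColorBrewer)

# brewer.pal() returns at least 3 colours, so take 3 and keep the first 2
mycols = c(brewer.pal(3, 'Oranges'), brewer.pal(3, 'Greens'), 
           brewer.pal(3, 'Blues')[1:2], brewer.pal(3, 'PuRd')[1:2])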

p0 + scale_fill_manual(values = mycols)
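One optional refinement, sketched under assumptions: scale_fill_manual() matches a named values vector to the factor levels by name, so naming the colours keeps each group tied to its colour even if the level order changes later. The level subset, the colour pairing and the small data frame below are arbitrary choices for illustration.

```r
library(ggplot2)

# Hypothetical subset of levels and an arbitrary colour pairing
lev  <- c("ETB", "PMA", "BUM", "HED")
cols <- c("#3168FF", "#00CCFF", "#FFFD00", "#FF0200")
named_cols <- setNames(cols, lev)

# Hypothetical miniature data set
dat <- data.frame(
  hhDomMil = factor(c("HED", "ETB", "PMA", "BUM", "ETB"), levels = lev),
  kclust   = c(1L, 1L, 2L, 2L, 2L)
)

# Named values are matched to levels by name, not by position
p <- ggplot(dat, aes(factor(kclust), fill = hhDomMil)) +
  geom_bar(position = "fill") +
  scale_fill_manual(values = named_cols)

print(p)
```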


If you relevel your hhDomMil as a factor like this:

o <- c("ETB", "PMA", "PER", "KON", "TRA", "DDR", "BUM", "MAT", "HED", "EXP")
d$hh <- factor(d$hhDomMil, levels = o)  # d is your data frame

then your plot will be in the order you like:

ggplot(d, aes(x = kclust, fill = hh)) + geom_bar(position = "fill")
