
R bar plot with 3 variables

I have a data frame with multiple variables, and I would like to know how I can plot them the way Excel's charting option does.

Just a simple data example:

    V1   V2   V3
    1    A    0
    1    A    0
    1    B    1
    1    B    0
    1    A    1
    2    A    0
    2    B    0
    2    A    0
    2    A    0
    2    A    0

What I'd like to have is an x axis with V1 and a y axis with the count of V3 for each value of V2 (A or B).

Can somebody please share some thoughts on how to do this? The barplot function doesn't seem capable of it, since it appears to work only with a 2*2 table.

Thank you.

Edit:

This plot was not generated from the data given above, though: [image]

Consider the y axis as the percentage of V3 and the x axis as V1, with one bar drawn for each level of V2.

library( 'ggplot2' )
library( 'reshape2' )
df1 <- dcast( data = df1, formula = V1 ~ V2,  value.var = 'V3',  fun.aggregate = sum )  # get sum of V3 by grouping V1 and V2
df1 <- melt( data = df1, id.vars = 'V1')   # melt data
df1
#    V1 variable value
# 1  1        A     1
# 2  2        A     5
# 3  1        B     1
# 4  2        B     0  


ggplot(data = df1, aes( x = factor( V1 ), y = value, fill = variable ) ) +    # print bar chart
  geom_bar( stat = 'identity' )


Using position = 'dodge' to place the bars side by side:

ggplot(data = df1, aes( x = factor( V1 ), y = value, fill = variable ) ) +    # print bar chart
  geom_bar( stat = 'identity', position = 'dodge' )


Data:

df1 <- read.table(text = 'V1   V2  V3
                  1    A    0
                  1    A    0
                  1    B    1
                  1    B    0
                  1    A    1
                  2    A    0
                  2    B    0
                  2    A    0
                  2    A    5
                  2    A    0', header = TRUE, stringsAsFactors = FALSE )
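
As a side note on the question's worry about barplot: base R's barplot is not limited to 2*2 tables and accepts any matrix or two-way table. The following is only a minimal sketch, assuming the same data as above (rebuilt here so the block runs on its own); the name counts is illustrative.

```r
# Rebuild the example data so this block is self-contained
df1 <- data.frame(
  V1 = rep(1:2, each = 5),
  V2 = c("A", "A", "B", "B", "A", "A", "B", "A", "A", "A"),
  V3 = c(0, 0, 1, 0, 1, 0, 0, 0, 5, 0)
)

# Sum V3 for every combination of V2 (rows) and V1 (columns)
counts <- xtabs(V3 ~ V2 + V1, data = df1)

# One cluster of bars per V1 value, one bar per level of V2
barplot(counts, beside = TRUE, legend.text = TRUE,
        xlab = "V1", ylab = "sum of V3")
```

Dropping beside = TRUE should give the stacked version instead.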

First you need to get a summary dataframe that contains the values you want to plot.

library(ggplot2)

df <- data.frame(V1 = rep(1:2,each=5), V2 = c("A","A","B", "B", "A", "A", "B","A", "A", "A"), 
                 V3 = c(0,0,1,0,1,0,0,0,0,0))

# sum V3 for each V1/V2 combination; the formula interface keeps the column name V3
values <- aggregate(V3 ~ V1 + V2, data = df, FUN = sum)

#    V1 V2 V3
# 1  1  A  1
# 2  2  A  0
# 3  1  B  1
# 4  2  B  0

ggplot(values, aes(x = factor(V1), y = V3, fill = V2))+
                geom_bar(stat = "identity", width = 0.2)


Or use this if you don't want the bars stacked on top of each other. It also adds some labels.

ggplot(values, aes(x = factor(V1), y = V3, fill = V2))+
                geom_bar(stat = "identity", width = 0.2, position = "dodge") +
                labs(x = "x", y = "count", fill = "group")


EDIT

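I also tried calling ggplot directly on the data frame without building a summary first. For a stacked chart the result is the same; with position = "dodge", note that rows sharing the same V1 and V2 are drawn on top of each other rather than summed.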

## a little change in V3
df <- data.frame(V1 = rep(1:2,each=5), 
                 V2 = c("A","A","B", "B", "A", "A", "B","A", "A", "A"), 
                 V3 = c(2,0,1,2,1,3,3,8,1,0))
## plot df directly
ggplot(df, aes(factor(V1), V3, fill = V2)) + 
        geom_bar(stat = "identity", width = 0.2, position = "dodge") +
        labs(x = "x", y = "count", fill = "group")
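
If the goal is side-by-side bars showing totals straight from the raw data, one option is to let ggplot2 do the summing while it draws. This is only a sketch under assumptions: it swaps geom_bar for stat_summary, and the fun argument needs ggplot2 3.3.0 or later (older versions used fun.y). The names totals and p are illustrative.

```r
library(ggplot2)

# Same modified data as in the edit above
df <- data.frame(V1 = rep(1:2, each = 5),
                 V2 = c("A", "A", "B", "B", "A", "A", "B", "A", "A", "A"),
                 V3 = c(2, 0, 1, 2, 1, 3, 3, 8, 1, 0))

# Reference totals per V1/V2 combination, computed the explicit way
totals <- aggregate(V3 ~ V1 + V2, data = df, FUN = sum)

# stat_summary sums V3 within each V1/V2 group before drawing the bars
p <- ggplot(df, aes(x = factor(V1), y = V3, fill = V2)) +
  stat_summary(fun = sum, geom = "bar", position = "dodge") +
  labs(x = "x", y = "count", fill = "group")
print(p)
```

The bar heights should then match the V3 column of totals, which makes it easy to check the plot against the summary table.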

