How do I create a bar plot in R where each bar is the percentage of a categorical var for a group?
Say I have data in the following format:
categoricalVar, numericVar, responseVar
Foo, 1, TRUE
Bar, 0, TRUE
Baz, 2, FALSE
...
...
... MUCH MUCH MORE
I want to create a bar plot where the X axis shows the 3 different types of categoricalVar and the Y axis shows the percentage of each that turned out to be TRUE. A table would work too, like this:
Foo, Bar, Baz
respPct 0.4, 0.6, 0.9
So out of all the Foos, the percentage of TRUE was 0.4.
The same thing for numericVar would be nice.
0, 1, 2, ....
respPct 0.1, 0.2, 0.2
Although I think it makes sense to group the numericVar values together, as follows:
0-5, 5-10, 10-15, ....
respPct 0.2, 0.3, 0.6
Can someone point me in the right direction?
First you have to transform your numericVar into a categorical variable. But let's first create some example data:
set.seed(2)
df <- data.frame(
  catVar  = rep(c("foo", "bar", "saz"), each = 10),
  respVar = c(sample(c(TRUE, TRUE, TRUE, FALSE, TRUE), 10, replace = TRUE),
              sample(c(FALSE, TRUE, TRUE, FALSE, TRUE), 10, replace = TRUE),
              sample(c(FALSE, FALSE, TRUE, FALSE, TRUE), 10, replace = TRUE)),
  numVar  = sample(0:15, 30, replace = TRUE)
)
1: create a categorical variable for numVar with:
df$catNum <- cut(df$numVar, breaks = c(-Inf,5,10,Inf), labels = c("0-5", "5-10", "10-15"))
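A note on the bin edges: by default cut() uses right-closed intervals, so with these breaks a value of exactly 5 lands in "0-5" and a value of exactly 10 lands in "5-10". A minimal sketch to check this, using a made-up vector (edge_values, binned and binned_left are names chosen only for illustration):

```r
# Hypothetical values sitting on and around the break points
edge_values <- c(0, 5, 6, 10, 11, 15)

# Same breaks and labels as above; right = TRUE is the default
binned <- cut(edge_values,
              breaks = c(-Inf, 5, 10, Inf),
              labels = c("0-5", "5-10", "10-15"))

# Show each value next to the bin it was assigned to
print(data.frame(value = edge_values, bin = binned))

# With right = FALSE the intervals are left-closed instead,
# so 5 moves to "5-10" and 10 moves to "10-15"
binned_left <- cut(edge_values,
                   breaks = c(-Inf, 5, 10, Inf),
                   labels = c("0-5", "5-10", "10-15"),
                   right = FALSE)
```

Which convention you want depends on how you read a label like "5-10"; either way it helps to pick one deliberately so the boundary values are not counted in a bin you did not expect.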
2: aggregate the data with:
# the mean of a logical vector is the proportion of TRUE values
# share of TRUE responses per category
df2 <- aggregate(respVar ~ catVar, df, FUN = mean)
# share of TRUE responses per numVar bin
df3 <- aggregate(respVar ~ catNum, df, FUN = mean)
3: create some plots with:
library(ggplot2)
ggplot(df2, aes(x = catVar, y = respVar)) +
  geom_bar(stat = "identity")
ggplot(df3, aes(x = catNum, y = respVar)) +
  geom_bar(stat = "identity")
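Since the question mentions that a table would work too, the same percentages can be produced without plotting. A minimal sketch using base R's tapply(); the toy data frame below is made up so the snippet runs on its own, and with the example data above you would presumably pass df$respVar together with df$catVar (or df$catNum) instead:

```r
# Made-up data, only so this snippet is self-contained
toy <- data.frame(
  catVar  = c("foo", "foo", "bar", "bar", "bar", "saz"),
  respVar = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE)
)

# Mean of a logical vector = proportion of TRUE, computed per group
resp_pct <- tapply(toy$respVar, toy$catVar, mean)
print(round(resp_pct, 2))

# The named vector can also go straight into a base-graphics bar plot
barplot(resp_pct, ylim = c(0, 1), ylab = "share of TRUE")
```

The result is a named vector with one entry per group, which is close to the table layout asked for in the question.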