
R barplot of two categorical variables

I have a data frame and I am interested in the relationship between two categorical variables, Type and Location. Type has 5 levels and Location has 20 levels.

I want to plot the percentage of each Type for each Location. Is there a concise way of doing this with ggplot2?

In my case the variable on the x axis has 20 levels, so I am also running into spacing issues. Any help would be appreciated.

EDIT: A more concrete example:

df
   gender beverage
1  Female     coke
2    Male     bear
3    Male     coke
4  Female     bear
5    Male      tea
6    Male     bear
7  Female    water
8  Female      tea
9  Female     bear
10   Male      tea

I want to plot the gender-wise percentage of each beverage. For example, there are 3 tea drinkers, of which 2 are male and 1 is female, so the male percentage would be 66.67 and the female percentage 33.33. On the x axis, tea should therefore have two bars: male with y = 66.67 and female with y = 33.33.
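For reference, here is a minimal sketch that rebuilds the example data frame and checks the expected numbers with base R's table() and prop.table(). The values are typed in from the printout above, and the name pct is only illustrative.

```r
# Example data typed in from the printout above.
df <- data.frame(
  gender   = c("Female", "Male", "Male", "Female", "Male",
               "Male", "Female", "Female", "Female", "Male"),
  beverage = c("coke", "bear", "coke", "bear", "tea",
               "bear", "water", "tea", "bear", "tea")
)

# Rows are beverages, columns are genders; margin = 1 makes each row
# (each beverage) sum to 100.
pct <- prop.table(table(df$beverage, df$gender), margin = 1) * 100
round(pct, 2)
```

The tea row of this table should hold the 33.33 / 66.67 split described above, and it gives a quick way to check whatever the plot ends up showing.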

The easiest way is to pre-process the data, since the percentages have to be calculated separately within each gender. I use complete() to make sure the zero-percent bars are explicitly present in the data frame; otherwise ggplot will ignore that bar and widen the other gender's bar.

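library(dplyr)
library(tidyr)
library(ggplot2)

# Count each gender/beverage pair and add the missing pairs with n = 0.
# complete() runs on ungrouped data here: on a grouped data frame it works
# within each group, so character columns would not get the missing pairs.
df2 <- df %>% 
  count(gender, beverage) %>% 
  complete(gender, beverage, fill = list(n = 0)) %>% 
  group_by(gender) %>% 
  mutate(percentage = n / sum(n) * 100) %>%  # share within each gender
  ungroup()

ggplot(df2, aes(beverage, percentage, fill = gender)) + 
  geom_bar(stat = 'identity', position = 'dodge') +
  theme_bw()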


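Or the other way around, with the percentages calculated within each beverage (this matches the tea example, where male is 66.67 and female is 33.33):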

# Same counts, but the percentages are computed within each beverage.
df3 <- df %>% 
  count(beverage, gender) %>% 
  complete(beverage, gender, fill = list(n = 0)) %>% 
  group_by(beverage) %>% 
  mutate(percentage = n / sum(n) * 100) %>% 
  ungroup()

ggplot(df3, aes(beverage, percentage, fill = gender)) + 
  geom_bar(stat = 'identity', position = 'dodge') +
  theme_bw()

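The question also mentions spacing problems with 20 levels on the x axis. One possible approach, sketched here under assumptions, is to rotate the tick labels or flip the axes. The demo data frame and its values are made up purely for illustration; only the column names Location and Type come from the question.

```r
library(ggplot2)

# Hypothetical data with 20 x-axis levels standing in for Location;
# the percentage values are random placeholders, not real results.
set.seed(1)
demo <- expand.grid(
  Location = paste0("Location_", 1:20),
  Type     = paste0("Type_", 1:5)
)
demo$percentage <- runif(nrow(demo), 0, 100)

p <- ggplot(demo, aes(Location, percentage, fill = Type)) +
  geom_bar(stat = 'identity', position = 'dodge') +
  theme_bw() +
  # Rotate the tick labels so 20 level names do not overlap.
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

# Alternative: flip the axes so the level names read horizontally.
p_flipped <- p + coord_flip()
```

Printing p or p_flipped draws the plot. With 5 dodged bars in each of 20 locations, a wider output device or facet_wrap(~ Type) may read better, depending on the data.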
