How to count and group multiple columns in R dataframe?
Really basic question... I have a dataframe like the one below, where the numbers indicate a score:
df<-data.frame(A=c(1,2,1,1,3,3,2,2),B=c(2,2,2,3,2,3,3,1),C=c(1,1,1,1,1,2,2,3))
And I would like to change it to this format to plot it in a stacked bar chart:
I know how to do it in a very roundabout and probably overly complicated way, and any suggestions on a more "streamlined" way to do it would be very welcome! Thanks in advance!
library(tidyverse)
df %>%
  pivot_longer(everything(), names_to = "Score") %>%
  count(Score, value, name = "Freq")
# A tibble: 9 × 3
  Score value  Freq
  <chr> <dbl> <int>
1 A         1     3
2 A         2     3
3 A         3     2
4 B         1     1
5 B         2     4
6 B         3     3
7 C         1     5
8 C         2     2
9 C         3     1
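Since the goal in the question is a stacked bar chart, here is a rough sketch of how the long-format counts could be passed to ggplot2. The object names counts and p and the axis labels are my own choices for illustration, not part of the original answer; geom_col() stacks bars that share an x position by default.

```r
library(tidyverse)

# Example data from the question
df <- data.frame(A = c(1, 2, 1, 1, 3, 3, 2, 2),
                 B = c(2, 2, 2, 3, 2, 3, 3, 1),
                 C = c(1, 1, 1, 1, 1, 2, 2, 3))

# Long format: one row per (column, score) pair with its frequency
counts <- df %>%
  pivot_longer(everything(), names_to = "Score") %>%
  count(Score, value, name = "Freq")

# One bar per original column, segments filled by score value.
# factor(value) makes the fill scale discrete instead of continuous.
p <- ggplot(counts, aes(x = Score, y = Freq, fill = factor(value))) +
  geom_col() +
  labs(x = "Column", y = "Frequency", fill = "Score")

print(p)
```

Each bar should sum to 8 here, because every column of df has 8 observations.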
The dplyr solutions are likely more scalable, but here is an alternative base R approach: use do.call along with lapply and table, then put it all in a data.frame:
data.frame(Name = rep(c("A", "B", "C"), each = 3),
           Score = rep(1:3, times = 3),  # scores cycle 1, 2, 3 within each column
           Frequency = do.call(c, lapply(df, table)))
#     Name Score Frequency
# A.1    A     1         3
# A.2    A     2         3
# A.3    A     3         2
# B.1    B     1         1
# B.2    B     2         4
# B.3    B     3         3
# C.1    C     1         5
# C.2    C     2         2
# C.3    C     3         1
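The snippet above hard-codes the column names and assumes every column contains all three scores. As a hedged sketch of one way to relax that, the version below derives both from the data and tabulates each column against a shared set of levels, so a score missing from a column is counted as 0. The names lv, tabs and res are mine, not from the original answer.

```r
# Example data from the question
df <- data.frame(A = c(1, 2, 1, 1, 3, 3, 2, 2),
                 B = c(2, 2, 2, 3, 2, 3, 3, 1),
                 C = c(1, 1, 1, 1, 1, 2, 2, 3))

# Every score value that occurs anywhere in the data
# (assumption: this is the full set of possible scores)
lv <- sort(unique(unlist(df)))

# Tabulate each column against the same levels so that
# all tables have the same length and order
tabs <- lapply(df, function(x) table(factor(x, levels = lv)))

res <- data.frame(Name = rep(names(tabs), each = length(lv)),
                  Score = rep(lv, times = length(tabs)),
                  Frequency = unlist(tabs, use.names = FALSE))
res
```

For this particular df the result has the same counts as the hard-coded version, just without the A.1-style row names.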
Using base R:
as.data.frame(table(stack(df)[2:1]))
  ind values Freq
1   A      1    3
2   B      1    1
3   C      1    5
4   A      2    3
5   B      2    4
6   C      2    2
7   A      3    2
8   B      3    3
9   C      3    1
We can turn the data into long format and then calculate the frequency:
df %>%
  gather(Name, Score, A:C) %>%
  group_by(Name, Score) %>%
  summarise(Frequency = n()) %>%
  ungroup()
  Name  Score Frequency
  <chr> <dbl>     <int>
1 A         1         3
2 A         2         3
3 A         3         2
4 B         1         1
5 B         2         4
6 B         3         3
7 C         1         5
8 C         2         2
9 C         3         1