How to find the number of rows for each factor level of each column in a data frame in R?

Question

I have several datasets, each with a different number of factor variables and an output variable. For each dataset I need the number of rows (observations) for each factor level of each variable, collected across all variables (columns). I thought a for loop might do the trick but am struggling with it. Could someone please help with this?

The data set looks something like this:

[image not available]

and I want the output to look like this:

[image not available]

I have tried

for (i in 1:length(df)){
  df %>% group_by(df[[i]]) %>% summarise(n = length(i)) %>% print()
}

but this doesn't seem to be working.

Answer 1

An option is to gather the data into 'long' format and then do the count:

library(tidyverse)
gather(df1, Variable,  Factor_Level, var1:var3) %>%
     count(Variable, Factor_Level)
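
The same reshape-then-count idea can be sketched with pivot_longer(), which superseded gather() in tidyr 1.0.0. The data frame df1 below is a hypothetical stand-in, since the asker's data was only shown as an image; the column names var1:var3 follow the answer above. Converting the factors to character first is a cautious choice here so that columns with different level sets stack without surprises.

```r
library(dplyr)
library(tidyr)

# Hypothetical example data (not the asker's actual data)
df1 <- data.frame(
  var1   = factor(c("a", "a", "b", "b")),
  var2   = factor(c("x", "y", "y", "y")),
  var3   = factor(c("m", "m", "m", "n")),
  output = c(1.2, 3.4, 5.6, 7.8)
)

counts <- df1 %>%
  # Convert factors to character so columns with different levels can be stacked
  mutate(across(var1:var3, as.character)) %>%
  # One row per (original row, variable) pair
  pivot_longer(var1:var3, names_to = "Variable", values_to = "Factor_Level") %>%
  # Number of rows for each level of each variable
  count(Variable, Factor_Level)

print(counts)
```

Each row of counts holds one variable name, one of its levels, and the number of observations at that level in column n.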

Answer 2

If you are OK with a list format, you could stop after creating the list. However, this is a (somewhat complex) alternative to the gather method proposed by akrun:

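library(dplyr)

# Names of the factor variables in the dataset (see "Data" below)
factor_vars <- names(mtcars)[sapply(mtcars, is.factor)]

# List of frequency tables, one per factor variable
# (across(all_of(x)) replaces the deprecated group_by_(.dots = x))
freq_tables <- lapply(factor_vars, function(x) group_by(mtcars, across(all_of(x))) %>% tally())

# Prepend the variable name as a column, then stack with common column names
freq_tables <- lapply(freq_tables, function(x) cbind(colnames(x)[1], x))
do.call(rbind, lapply(freq_tables, setNames, c("Factor", "Level", "Count")))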

   Factor Level Count
1      vs     0    18
2      vs     1    14
3      am     0    19
4      am     1    13
5    gear     3    15
6    gear     4    12
7    gear     5     5
8    carb     1     7
9    carb     2    10
10   carb     3     3
11   carb     4    10
12   carb     6     1
13   carb     8     1

Data:

mtcars[8:11] <- lapply(mtcars[8:11], factor)
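
For comparison, the same long table can be sketched in base R only, using table() on each factor column. This is an alternative under the same assumptions as the answer (columns 8 to 11 of mtcars converted to factors), not the answerer's own method; the name cars is a local copy introduced here so the built-in mtcars is left unchanged.

```r
# Work on a local copy so the built-in dataset is not modified
cars <- mtcars
cars[8:11] <- lapply(cars[8:11], factor)

# Names of the factor columns: vs, am, gear, carb
factor_vars <- names(cars)[sapply(cars, is.factor)]

# One small data frame per factor column, stacked into a long table
freq <- do.call(rbind, lapply(factor_vars, function(v) {
  tab <- table(cars[[v]])
  data.frame(
    Factor = v,
    Level  = names(tab),
    Count  = as.vector(tab),
    stringsAsFactors = FALSE
  )
}))

print(freq)
```

Because table() also reports levels that occur zero times, this variant keeps empty levels in the result, which the group-and-tally approach drops.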

Answer 3

You should be able to do something like

by(data$x, data$y, FUN)

where data$x holds the values to be grouped, data$y is the grouping variable, and FUN is the function to apply to each group (e.g. mean, length, shapiro.test). Then you can coerce this output to a vector using as.vector().

For instance, with the data frame df <- data.frame(ID = c(1, 1, 1, 1, 2, 2, 3), value = c(10, 20, 30, 40, 50, 60, 70)), running as.vector(by(df$value, df$ID, length)) would return the vector (4, 2, 1).
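
A runnable sketch of this example, with the typos fixed (data.frame(), c(), df$ID, length), might look like the following. It also shows tapply(), which the documentation describes by() as a wrapper around, and table(), as two other base R ways to get the same counts; both are additions for comparison and not part of the original answer.

```r
# Example data from the answer, with the typos fixed
df <- data.frame(
  ID    = c(1, 1, 1, 1, 2, 2, 3),
  value = c(10, 20, 30, 40, 50, 60, 70)
)

# Number of rows per ID via by(), coerced to a plain vector
by_counts <- as.vector(by(df$value, df$ID, length))

# The same counts via tapply()
tapply_counts <- as.vector(tapply(df$value, df$ID, length))
print(tapply_counts)  # 4 2 1

# table() counts rows per level directly and keeps the level names
id_table <- table(df$ID)
print(id_table)
```

To cover every factor column of a data frame, the same call can be repeated per column, for example with lapply() over the column names.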
