Frequency table and group by multiple variables in R

Question

Folks, I need an elegant way of creating frequency counts grouped by multiple variables. The output should be a data frame. I know the answer lies somewhere in dplyr and data.table, which I am still learning. I tried this link, but I want to do this using dplyr and data.table.

Here is the sample data from the same link:

ID <- 1:177
Age <- sample(c("0-15", "16-29", "30-44", "45-64", "65+"), 177, replace = TRUE)
Sex <- sample(c("Male", "Female"), 177, replace = TRUE)
Country <- sample(c("England", "Wales", "Scotland", "N. Ireland"), 177, replace = TRUE)
Health <- sample(c("Poor", "Average", "Good"), 177, replace = TRUE)
Survey <- data.frame(Age, Sex, Country, Health)

Here is the output I am looking for. Thanks, I appreciate your help!

Answer 1

We can use dcast from data.table:

library(data.table)
# Count rows per Age/Sex pair, one column per Health level, then add a row total
dcast(setDT(Survey), Age + Sex ~ Health, value.var = "Country",
      fun.aggregate = length)[, Total := Average + Good + Poor][]
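To see what the reshape produces, here is a minimal sketch on a hand-checkable table. The `toy` data and its values are made up for illustration and are not part of the Survey data above.

```r
library(data.table)

# Hypothetical toy data, small enough to verify the counts by eye
toy <- data.table(
  Age     = c("0-15", "0-15", "0-15", "65+", "65+"),
  Sex     = c("Male", "Male", "Female", "Female", "Female"),
  Country = c("Wales", "England", "Wales", "Scotland", "Wales"),
  Health  = c("Good", "Poor", "Good", "Average", "Average")
)

# One row per Age/Sex pair, one count column per Health level.
# Combinations that never occur are filled with 0 by length().
wide <- dcast(toy, Age + Sex ~ Health, value.var = "Country",
              fun.aggregate = length)
wide[, Total := Average + Good + Poor]
print(wide)
```

Any column can serve as value.var here, since length only counts rows and ignores the values themselves.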

If we don't want to type the column names, use Reduce with `+`:

dcast(setDT(Survey), Age + Sex ~ Health, value.var = "Country",
      length)[, Total := Reduce(`+`, .SD), .SDcols = Average:Poor][]
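As a rough illustration of what Reduce is doing there, the sketch below folds `+` over three count columns and compares the result with rowSums. The `counts` table is a hypothetical stand-in for the Average, Good and Poor columns.

```r
library(data.table)

# Hypothetical count columns standing in for Average, Good and Poor
counts <- data.table(Average = c(0L, 0L, 2L),
                     Good    = c(1L, 1L, 0L),
                     Poor    = c(0L, 1L, 0L))

# Reduce folds `+` over the columns: (Average + Good) + Poor, element-wise
total_reduce <- Reduce(`+`, counts)

# rowSums yields the same totals, returned as doubles rather than integers
total_rowsums <- rowSums(counts)

print(total_reduce)
print(total_rowsums)
```

Inside the dcast chain, .SD plays the role of `counts`: it is the subset of columns selected by .SDcols.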

Answer 2

Here is a method using data.table and tidyr but not dcast. First, count observations with .N in j, grouped by the variables of interest (Survey must already be a data.table, e.g. after setDT(Survey)):

Survey[, .N, by=.(Age, Sex, Health)]

returning:

 Age   Sex     Health   N
 30-44 Female  Average  10
 65+   Female  Poor     9
 0-15  Male    Average  3
 16-29 Male    Average  6
 30-44 Male    Good     6
 45-64 Female  Average  8
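The counting step can be checked on a small example. This is only a sketch; the `toy` data frame is made up for illustration.

```r
library(data.table)

# Hypothetical toy data, small enough to verify the counts by eye
toy <- data.frame(
  Age    = c("0-15", "0-15", "0-15", "65+", "65+"),
  Sex    = c("Male", "Male", "Female", "Female", "Female"),
  Health = c("Good", "Poor", "Good", "Average", "Average")
)

# setDT converts the data.frame to a data.table by reference;
# the [, j, by] syntax below needs a data.table
setDT(toy)

# .N is the number of rows in each Age/Sex/Health group
long <- toy[, .N, by = .(Age, Sex, Health)]
print(long)
```

Only combinations that actually occur get a row, so there are no zero counts in this long format; gaps only show up once the table is reshaped to wide.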

Then, use spread from tidyr to turn the column of your choice into a set of new columns (one for each unique value), populated by N:

library(tidyr)
spread(Survey[, .N, by=.(Age, Sex, Health)], Health, N)
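Since the question also asked about dplyr, here is a possible sketch of the same count-then-widen idea with dplyr and tidyr. It is not from either answer. It uses pivot_wider, which the tidyr documentation lists as the replacement for the superseded spread, and the `toy` data is made up for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, small enough to verify the counts by eye
toy <- data.frame(
  Age    = c("0-15", "0-15", "0-15", "65+", "65+"),
  Sex    = c("Male", "Male", "Female", "Female", "Female"),
  Health = c("Good", "Poor", "Good", "Average", "Average")
)

wide <- toy %>%
  count(Age, Sex, Health) %>%                 # long counts in a column named n
  pivot_wider(names_from = Health,            # one column per Health level
              values_from = n,
              values_fill = list(n = 0L)) %>% # missing combinations become 0
  mutate(Total = Average + Good + Poor)

print(wide)
```

The result is a tibble; wrap it in as.data.frame() if a plain data frame is required.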

Answer 1
3 Accepted 2017-01-31 04:22:22

Answer 2
1 2017-01-31 05:02:30
