Put rowwise counts of value occurrences into new variables: how to do that in R with dplyr?

Question

I have a large dataframe (df) that looks like this:

structure(list(var1 = c(1, 2, 3, 4, 2, 3, 4, 3, 2), var2 = c(2, 
3, 4, 1, 2, 1, 1, 1, 3), var3 = c(4, 4, 2, 3, 3, 1, 1, 1, 4), 
    var4 = c(2, 2, 2, 2, 3, 2, 3, 4, 1), var5 = c(4, 4, 2, 3, 
    3, 1, 1, 1, 4)), .Names = c("var1", "var2", "var3", "var4", 
"var5"), row.names = c(NA, -9L), class = "data.frame")

  var1 var2 var3 var4 var5
1    1    2    4    2    4
2    2    3    4    2    4
3    3    4    2    2    2
4    4    1    3    2    3
5    2    2    3    3    3
6    3    1    1    2    1
7    4    1    1    3    1
8    3    1    1    4    1
9    2    3    4    1    4

Now I need to count the occurrences of each value row by row and store the counts in new variables. This should be the result:

  var1 var2 var3 var4 var5 n_1 n_2 n_3 n_4
1    1    2    4    2    4   1   2   0   2
2    2    3    4    2    4   0   2   1   2
3    3    4    2    2    2   0   3   1   1
4    4    1    3    2    3   1   1   2   1
5    2    2    3    3    3   0   2   3   0
6    3    1    1    2    1   3   1   1   0
7    4    1    1    3    1   3   0   1   1
8    3    1    1    4    1   3   0   1   1
9    2    3    4    1    4   1   1   1   2

As you can see, variable n_1 holds the per-row count of 1's, n_2 the per-row count of 2's, and so on.

I tried some dplyr functions (because I like their speed), but haven't succeeded yet. I know this is definitely ugly code :-), but my approach would be something like this:

newdf <- mutate(rowwise(df), n_1 = sum(df==1))

Does anyone have an idea about how to deal with this problem? Many thanks in advance!

Answer 1

This uses rowwise() and do() from dplyr, but it's definitely ugly.

I'm not sure whether this can be modified so that you get a data.frame output directly, as shown at https://github.com/hadley/dplyr/releases .

interim_res <- df %>% 
                  rowwise() %>% 
                  do(out = sapply(min(df):max(df), function(i) sum(i==.)))

interim_res <- interim_res[[1]] %>% do.call(rbind,.) %>% as.data.frame(.)

Then, to get the intended result:

res <- cbind(df,interim_res)
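
For comparison, the same counts can probably be obtained without iterating over rows at all. The following is a minimal base R sketch, not part of the original answer. It assumes that every column of df should be searched and that the values of interest are 1 to 4; the names vals and counts are made up for illustration.

```r
# Example data from the question
df <- data.frame(
  var1 = c(1, 2, 3, 4, 2, 3, 4, 3, 2),
  var2 = c(2, 3, 4, 1, 2, 1, 1, 1, 3),
  var3 = c(4, 4, 2, 3, 3, 1, 1, 1, 4),
  var4 = c(2, 2, 2, 2, 3, 2, 3, 4, 1),
  var5 = c(4, 4, 2, 3, 3, 1, 1, 1, 4)
)

vals <- 1:4  # assumption: the set of values to count

# For each value k, df == k is a logical matrix;
# rowSums() then counts the TRUE entries in each row.
counts <- sapply(vals, function(k) rowSums(df == k))
colnames(counts) <- paste0("n_", vals)

# Append the count columns to the original data
res <- cbind(df, counts)
```

Because the comparison is done on the whole data frame at once, this should scale better than a per-row loop on a large df, and the new columns already carry the n_1 to n_4 names.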

Answer 2

This is a solution using base functions:

dd <- t(apply(df, 1, function(x) table(factor(x, levels=1:4))))
colnames(dd) <- paste("n",1:4, sep="_")
cbind(df, dd)

Just use the table command across the rows of your data.frame to get the counts of each value from 1 to 4.
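
To see why the factor() call with explicit levels matters, here is a small sketch on a single row (the first row of df, typed in by hand). The variable names x, plain and counts are only for illustration.

```r
# First row of the example data
x <- c(1, 2, 4, 2, 4)

# A plain table() only reports values that actually occur,
# so the value 3 is missing from the result.
plain <- table(x)

# Converting to a factor with fixed levels forces one entry
# per level, including a zero count for the absent value 3.
counts <- table(factor(x, levels = 1:4))

print(plain)
print(counts)
```

Without fixed levels, rows would yield tables of different lengths and apply() could not return a regular matrix to transpose.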

Answer 3

Here is an approach using the qdapTools package:

library(qdapTools)

dat <- df  # this answer refers to the question's data frame as `dat`

data.frame(dat, setNames(mtabulate(split(dat, id(dat))), paste0("n_", 1:4)))

##   var1 var2 var3 var4 var5 n_1 n_2 n_3 n_4
## 1    1    2    4    2    4   1   2   0   2
## 2    2    3    4    2    4   0   2   1   2
## 3    3    4    2    2    2   0   3   1   1
## 4    4    1    3    2    3   1   1   2   1
## 5    2    2    3    3    3   0   2   3   0
## 6    3    1    1    2    1   3   1   1   0
## 7    4    1    1    3    1   3   0   1   1
## 8    3    1    1    4    1   3   0   1   1
## 9    2    3    4    1    4   1   1   1   2
