Convert a data frame to contain value counts

Question

I have created a data frame that contains ids and string values:

mycols <- c('id','2')
ids <- c(1,1,2,3)
stringvalues <- c('a','a','b','c')
mydf <- data.frame(ids , stringvalues)

mydf contains:

  ids stringvalues
1   1            a
2   1            a
3   2            b
4   3            c

I'm attempting to produce a new data frame that contains each id and the corresponding count for each string:

id, a , b , c
1 , 2 , 0 , 0
2 , 0 , 1 , 0
3 , 0 , 0 , 1

I tried writing one summarise call per string:

g1 <- group_by(mydf , ids)  
s1 <- summarise(g1 , a = count('a')) 
s2 <- summarise(g1 , b = count('b')) 
s3 <- summarise(g1 , c = count('c'))

But this returns an error: Evaluation error: no applicable method for 'groups' applied to an object of class "character".

How can I create new columns that count the number of occurrences of each string in the column?

Answer 1

Does doing a dplyr::count followed by tidyr::spread work for you? (I'm only posting this because you mentioned wanting to create a data frame of this sort; otherwise it's much simpler to use table(mydf), as the other comments/answers suggest.)

library(dplyr)
library(tidyr)

mydf %>% count(ids, stringvalues) %>% spread(stringvalues, n, fill = 0)

#> # A tibble: 3 x 4
#>     ids     a     b     c
#> * <dbl> <dbl> <dbl> <dbl>
#> 1     1     2     0     0
#> 2     2     0     1     0
#> 3     3     0     0     1
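In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(), so the same reshape can be sketched with the newer function. This is a minimal sketch that assumes tidyr >= 1.0.0 is installed and rebuilds mydf from the question so that it runs on its own; res is just an illustrative name.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
mydf <- data.frame(ids = c(1, 1, 2, 3),
                   stringvalues = c("a", "a", "b", "c"))

res <- mydf %>%
  count(ids, stringvalues) %>%                 # one row per (id, string) pair, count in column n
  pivot_wider(names_from = stringvalues,       # each distinct string becomes a column
              values_from = n,                 # cells hold the counts
              values_fill = list(n = 0L))      # missing combinations become 0 instead of NA

res
```

The named-list form of values_fill is used here because it is accepted by both tidyr 1.0.0 and later releases.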

Answer 2

Here's a base-R solution.

Output option 1 (row number = ID):

data.frame(cbind(table(mydf)))

Output option 2 (with ID as a column):

data.frame(cbind(id=unique(mydf$ids),table(mydf)))

  id a b c
1  1 2 0 0
2  2 0 1 0
3  3 0 0 1
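One caveat with unique(mydf$ids): it returns the ids in order of first appearance, while the rows of table(mydf) are ordered by sorted id, so the two only line up when the ids already appear in sorted order. A hedged variant that takes the id from the table's own row names could look like the sketch below. The shuffled mydf and the names tab, counts, and res are illustrative, not from the original answer.

```r
# Same rows as in the question, deliberately shuffled so ids are not sorted
mydf <- data.frame(ids = c(3, 1, 1, 2),
                   stringvalues = c("c", "a", "a", "b"))

tab <- table(mydf$ids, mydf$stringvalues)   # rows: sorted ids, columns: strings
counts <- as.data.frame.matrix(tab)         # keep the wide layout of the table

# Take the id from the row names so it always matches the count rows
res <- cbind(id = as.numeric(rownames(counts)), counts)
rownames(res) <- NULL

res
```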

Answer 3

You can use count directly. First,

count(mydf, ids,stringvalues)

gives

 # A tibble: 3 x 3
 ids stringvalues     n
 <dbl>       <fctr> <int>
1     1            a     2
2     2            b     1
3     3            c     1

Then reshape:

count(mydf, ids,stringvalues) %>% tidyr::spread(stringvalues, n)

gives

# A tibble: 3 x 4
    ids     a     b     c
* <dbl> <int> <int> <int>
1     1     2    NA    NA
2     2    NA     1    NA
3     3    NA    NA     1

Then replace the NAs with something like res[is.na(res)] <- 0, where res is the object constructed above.
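Put together, the three steps might look like the following sketch. It rebuilds mydf from the question so it can run on its own and assumes dplyr and tidyr are installed; res is the name used in the text above.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
mydf <- data.frame(ids = c(1, 1, 2, 3),
                   stringvalues = c("a", "a", "b", "c"))

# Count each (id, string) pair, then spread the strings into columns
res <- count(mydf, ids, stringvalues) %>%
  spread(stringvalues, n)

# Combinations that never occur come back as NA; turn them into 0
res[is.na(res)] <- 0

res
```

Passing fill = 0 to spread(), as in Answer 1, gives the same result without the separate replacement step.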
