How to calculate the number of times a value occurs for different unique IDs in R?
I would like to count how many times zero and one occur for each ID. I have a single column with more than 500 unique IDs, and the number of zeroes and ones differs from ID to ID. Thanks!
I am currently doing this in R using for loops.
I guess something like this could help you:
#Example dataframe
dummy=data.frame(ID=c(10101,11110101,11111))
#Separate every character in ID column
Sepdummy=strsplit(as.character(dummy$ID), split="")
#Count how many times a value is repeated
dummy$Zeroes=unlist(lapply(Sepdummy, function(x) sum(as.numeric(x)==0)))
dummy$Ones=unlist(lapply(Sepdummy, function(x) sum(as.numeric(x)==1)))
The output looks like:
      ID Zeroes Ones
   10101      2    3
11110101      2    6
   11111      0    5
The above will not work if your IDs aren't numeric. In that case you can use str_count() from the stringr package (as pointed out elsewhere in this post):
library(stringr)
#Example dataframe
dummy=data.frame(ID=c(10101,11110101,11111,"asd0110001df"))
#Count using str_count and add the results to the original dummy dataframe, so the results are all viewed in the same table.
dummy$Zeroes=str_count(dummy$ID, "0")
dummy$Ones=str_count(dummy$ID, "1")
The stringr package provides the function str_count(), which counts the number of occurrences of a pattern (for example a single character) in a string.
library(stringr)
str_count("abracadabra", "a") # return 5
str_count("0010110", "0") # return 4
str_count("001d021", "0|1") # return 5
str_count(c("001", "123", "salut"), "0|1") # return (3, 1, 0)
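Since str_count() is vectorised, it can be applied to a whole ID column at once. The following is a minimal sketch under assumptions: the `ids` vector is invented for illustration, and `fixed()` is used so that the pattern is matched literally rather than as a regular expression. A base R cross-check with `gsub()` and `nchar()` is included for readers who prefer not to load stringr.

```r
library(stringr)

# Hypothetical IDs, invented for illustration (not the asker's data)
ids <- c("10101", "asd0110001df", "xyz")

# fixed() matches the literal character instead of interpreting a regex
zeroes <- str_count(ids, fixed("0"))
ones   <- str_count(ids, fixed("1"))

# Base R cross-check: delete every character except the target, then count what is left
zeroes_base <- nchar(gsub("[^0]", "", ids))
ones_base   <- nchar(gsub("[^1]", "", ids))

print(data.frame(ID = ids, Zeroes = zeroes, Ones = ones))
```

Both approaches should agree; IDs that contain neither character simply get a count of zero.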
Alternative guess: maybe your data frame looks like this?
library(dplyr)
set.seed(1)
data.df <- data.frame(id=c(rep(1,10),rep(2,10)), value=rbinom(20,1,.5))
count.df <- data.df %>%
  group_by(id) %>%
  summarize(ones = sum(value == 1), zeros = sum(value == 0)) %>%
  ungroup() %>%
  as.data.frame()
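For comparison, the same per-ID sums can be sketched in base R with `tapply()`, which may be convenient when dplyr is not available. The small data frame below is hypothetical and fixed (no random draw), so the counts are easy to check by hand.

```r
# Hypothetical long-format data: one row per observation (invented for illustration)
data.df <- data.frame(
  id    = c(1, 1, 1, 2, 2),
  value = c(0, 1, 1, 0, 0)
)

# Sum the logical comparisons within each id; TRUE counts as 1
ones  <- tapply(data.df$value == 1, data.df$id, sum)
zeros <- tapply(data.df$value == 0, data.df$id, sum)

# Collect the results in one data frame, one row per id
count.df <- data.frame(
  id    = names(ones),
  ones  = as.vector(ones),
  zeros = as.vector(zeros)
)
print(count.df)
```

Here id 1 has two ones and one zero, and id 2 has no ones and two zeros.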
With the vector you gave above:
Transform vect into a usable data frame:
data=data.frame(matrix(vect,,2,byrow=T))
with(data,table(ID,Treatment))
         Treatment
ID         0  1
  100a002 16  8
  100a003 18  6
Data:
data=read.table(text=" ID Treatment
100a002 1
100a002 0
100a002 0
100a002 0
100a002 1
100a002 1
100a002 1
100a002 0
100a002 0
100a002 0
100a002 0
100a002 0
100a002 0
100a002 0
100a002 0
100a002 0
100a002 0
100a002 0
100a002 0
100a002 0
100a002 1
100a002 1
100a002 1
100a002 1
100a003 0
100a003 0
100a003 0
100a003 0
100a003 0
100a003 0
100a003 0
100a003 0
100a003 0
100a003 0
100a003 0
100a003 0
100a003 0
100a003 0
100a003 0
100a003 0
100a003 0
100a003 0
100a003 1
100a003 1
100a003 1
100a003 1
100a003 1
100a003 1",h=T,stringsAsFactors=F)
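If the counts from table() are needed as ordinary columns (for example to merge them back onto other data), the contingency table can be unpacked into a data frame. This is a sketch on a small hypothetical data set with the same column names as above; the names `tab`, `wide`, `zeroes` and `ones` are illustrative choices, not part of the original answer.

```r
# Hypothetical data with the same column names as above (invented for illustration)
data <- data.frame(
  ID        = c("a", "a", "b", "b", "b"),
  Treatment = c(0, 1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Cross-tabulate: rows are IDs, columns are the Treatment values "0" and "1"
tab <- with(data, table(ID, Treatment))

# Unpack the table into a data frame with one row per ID
wide <- data.frame(
  ID     = rownames(tab),
  zeroes = as.vector(tab[, "0"]),
  ones   = as.vector(tab[, "1"])
)
print(wide)
```

Note that this assumes both values occur somewhere in the column; if one of them never appears, table() will not create a column for it.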
The most efficient way to do it (using data.table):
dummy<- data.frame(id=c(rep(1,10),rep(2,10)), value=rbinom(20,1,.5))
library(data.table)
setDT(dummy)[, list(count_of_one = length(which(value==1)),count_of_zeroes = length(which(value==0))), by = id]
Output (the exact counts depend on the random draw, since this snippet sets no seed):
   id count_of_one count_of_zeroes
1:  1            5               5
2:  2            6               4
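A slightly shorter variant of the same grouped count is sketched below: summing a logical vector directly gives the same result as `length(which(...))`. The input here is a small fixed data set, invented for illustration, so that the result is reproducible.

```r
library(data.table)

# Hypothetical fixed data (invented for illustration, no random draw)
dummy <- data.table(
  id    = c(1, 1, 1, 2, 2),
  value = c(0, 1, 1, 0, 0)
)

# sum() over a logical vector counts the TRUE values within each id
res <- dummy[, .(count_of_one    = sum(value == 1),
                 count_of_zeroes = sum(value == 0)),
             by = id]
print(res)
```

For this input, id 1 has two ones and one zero, and id 2 has no ones and two zeroes.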